Calculus is the study of functional relationships and how related quantities change with
each other. In your first exposure to calculus, the primary focus of your attention was 
on functions involving a single independent variable and a single dependent variable. For 
such a function $f$, a single real number input $x$ determines a unique single output value
$f(x)$. However, many of the functions of importance both within mathematics itself as
well as in the application of mathematics to the rest of the world involve many variables 
simultaneously. For example, frequently in physics the function which describes the force 
acting on an object moving in space depends on three variables, the three coordinates 
which describe the location of the object. If the force function also varies with time, 
then the force depends on four variables. Moreover, the output of the force function will 
itself involve three variables, the three coordinate components of the force. Hence the 
force function is such that it takes three, or four, variables for input and outputs three 
variables. Far more complicated functions are easy to imagine: the gross national product 
of a country is a function of thousands of variables with a single variable as output; an
airline schedule is a function with thousands of inputs (cities, planes, and people to be 
scheduled, as well as other variables like fuel costs and the schedules of competing airlines) 
and perhaps hundreds of outputs (the particular routes flown, along with their times). 
Although such functions may at first appear to be far more difficult to work with than 
the functions of single variable calculus, we shall see that we will often be able to reduce 
problems involving functions of several variables to related problems involving only single 
variable functions, problems which we may then handle using already familiar techniques. 

By definition, a function takes a single input value and associates it with a single 
output value. Hence, even though in this book the inputs to our functions will often 
involve several variables, as will the outputs, we will nevertheless want to regard the input 
and output of a function as single points in some multidimensional space. This is natural 
in the case of, for example, the force function described above, where the input is a point 
in three dimensional space, four if we need to use time, but requires some mathematical 
abstraction if we want to consider the input to the gross national product function as a 
point in some space of many thousands of dimensions. Because even the geometry of two- 
and three-dimensional space may be in some respects new to you, we will use this chapter 
to study the geometry of multidimensional space before proceeding to the study of calculus 
proper in Chapter 2. 

Throughout the book we will let $\mathbb{R}$ denote the set of real numbers.

Definition By n-dimensional Euclidean space we mean the set

$\mathbb{R}^n = \{(x_1, x_2, \ldots, x_n) : x_i \in \mathbb{R},\ i = 1, 2, \ldots, n\}.$ (1.1.1)

Figure 1.1.1 A point in $\mathbb{R}^3$

That is, $\mathbb{R}^n$ is the space of all ordered n-tuples of real numbers. We will denote a point in
this space by

$\mathbf{x} = (x_1, x_2, \ldots, x_n),$ (1.1.2)

and, for $i = 1, 2, \ldots, n$, we call $x_i$ the ith coordinate of $\mathbf{x}$.
Example When n = 2, we have

$\mathbb{R}^2 = \{(x_1, x_2) : x_1, x_2 \in \mathbb{R}\},$

which is our familiar representation for points in the Cartesian plane. As usual, we will
in this case frequently label the coordinates as $x$ and $y$, or something similar, instead of
numbering them as $x_1$ and $x_2$.

Example When n = 3, we have

$\mathbb{R}^3 = \{(x_1, x_2, x_3) : x_1, x_2, x_3 \in \mathbb{R}\}.$

Just as we can think of $\mathbb{R}^2$ as a way of assigning coordinates to points in the Euclidean
plane, we can think of $\mathbb{R}^3$ as assigning coordinates to three-dimensional Euclidean space. To
picture this space, we must imagine three mutually perpendicular axes with the coordinates
marked off along the axes as in Figure 1.1.1. Again, we will frequently label the coordinates
of a point in $\mathbb{R}^3$ as, for example, $x$, $y$, and $z$, or $u$, $v$, and $w$, rather than using numbered
coordinates.

Example If an object moves through space, its location may be specified with four
coordinates: three spatial coordinates, say, $x$, $y$, and $z$, and one time coordinate, say $t$.
Thus its location is specified by a point $\mathbf{p} = (x, y, z, t)$ in $\mathbb{R}^4$. Of course, we cannot draw
a picture of such a point.

Before beginning our geometric study of $\mathbb{R}^n$, we first need a few basic algebraic definitions.

Definition Let $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ be points in $\mathbb{R}^n$ and let $a$
be a real number. Then we define

$\mathbf{x} + \mathbf{y} = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n),$ (1.1.3)

$\mathbf{x} - \mathbf{y} = (x_1 - y_1, x_2 - y_2, \ldots, x_n - y_n),$ (1.1.4)

and

$a\mathbf{x} = (ax_1, ax_2, \ldots, ax_n).$ (1.1.5)

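Example If $\mathbf{x} = (2, -3, 1)$ and $\mathbf{y} = (-4, 1, -2)$ are two points in $\mathbb{R}^3$, then

$\mathbf{x} + \mathbf{y} = (-2, -2, -1),$

$\mathbf{x} - \mathbf{y} = (6, -4, 3),$

$\mathbf{y} - \mathbf{x} = (-6, 4, -3),$

$3\mathbf{x} = (6, -9, 3),$

and

$-2\mathbf{y} = (8, -2, 4).$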
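The arithmetic in this example can be checked mechanically. The following is a minimal sketch, assuming we represent a point in $\mathbb{R}^n$ as a Python tuple; the function names `add`, `sub`, and `scale` are our own illustrative choices and do not come from the text.

```python
# Minimal sketch: points in R^n are plain tuples of numbers.
# The names add, sub, and scale are illustrative, not from the text.

def add(x, y):
    """Coordinate-wise sum, following (1.1.3)."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    return tuple(a + b for a, b in zip(x, y))


def sub(x, y):
    """Coordinate-wise difference, following (1.1.4)."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    return tuple(a - b for a, b in zip(x, y))


def scale(a, x):
    """Scalar multiple, following (1.1.5)."""
    return tuple(a * c for c in x)


x = (2, -3, 1)
y = (-4, 1, -2)
print(add(x, y))     # (-2, -2, -1)
print(sub(x, y))     # (6, -4, 3)
print(sub(y, x))     # (-6, 4, -3)
print(scale(3, x))   # (6, -9, 3)
print(scale(-2, y))  # (8, -2, 4)
```

Tuples are used here only because they are immutable and need no extra libraries; any fixed-length sequence would serve equally well.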

Notice that we defined addition and subtraction for points in $\mathbb{R}^n$, but we did not define
multiplication. In general there is no form of multiplication for such points that is useful
for our purposes. Of course, multiplication is defined in the special case n = 1 and for the
special case n = 2 if we consider the points in $\mathbb{R}^2$ as points in the complex plane. We
shall see in Section 1.3 that there is also an interesting and useful type of multiplication
in $\mathbb{R}^3$. Also note that (1.1.5) does provide a method for multiplying a point in $\mathbb{R}^n$ by
a real number, the result being another point in $\mathbb{R}^n$. In such cases we often refer to the
real number as a scalar and this multiplication as scalar multiplication. We shall provide
a geometric interpretation of this form of multiplication shortly.

Geometry of $\mathbb{R}^n$

Recall that if $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$ are two points in $\mathbb{R}^2$, then, using the
Pythagorean theorem, the distance from $\mathbf{x}$ to $\mathbf{y}$ is

$\sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2}.$ (1.1.6)

This formula is easily generalized to $\mathbb{R}^3$: Suppose $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$ are
two points in $\mathbb{R}^3$. Let $\mathbf{z} = (y_1, y_2, x_3)$. Since the first two coordinates of $\mathbf{y}$ and $\mathbf{z}$ are the
same, $\mathbf{y}$ and $\mathbf{z}$ lie on the same vertical line, and so the distance between them is simply

$|y_3 - x_3|.$ (1.1.7)

Moreover, $\mathbf{x}$ and $\mathbf{z}$ have the same third coordinate, and so lie in the same horizontal plane.
Hence the distance between $\mathbf{x}$ and $\mathbf{z}$ is the same as the distance between $(x_1, x_2)$ and
$(y_1, y_2)$ in $\mathbb{R}^2$, that is,

$\sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2}.$ (1.1.8)
Figure 1.1.2 Distance from $\mathbf{x} = (x_1, x_2, x_3)$ to $\mathbf{y} = (y_1, y_2, y_3)$


Finally, the points $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ form a right triangle with right angle at $\mathbf{z}$. Hence, using
the Pythagorean theorem again, the distance from $\mathbf{x}$ to $\mathbf{y}$ is

$\sqrt{\left(\sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2}\right)^2 + |y_3 - x_3|^2} = \sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2 + (y_3 - x_3)^2}.$

In particular, if we let $\|\mathbf{x}\|$ denote the distance from $\mathbf{x} = (x_1, x_2, x_3)$ to the origin $(0, 0, 0)$
in $\mathbb{R}^3$, then

$\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + x_3^2}.$ (1.1.9)

With this notation, the distance from $\mathbf{x}$ to $\mathbf{y}$ is

$\|\mathbf{y} - \mathbf{x}\| = \|(y_1 - x_1, y_2 - x_2, y_3 - x_3)\| = \sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2 + (y_3 - x_3)^2}.$ (1.1.10)


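Example If $\mathbf{x} = (1, 2, -3)$ and $\mathbf{y} = (3, -2, 1)$, then the distance from $\mathbf{x}$ to the origin is

$\|\mathbf{x}\| = \sqrt{1^2 + 2^2 + (-3)^2} = \sqrt{14}$

and the distance from $\mathbf{x}$ to $\mathbf{y}$ is given by

$\|\mathbf{y} - \mathbf{x}\| = \|(2, -4, 4)\| = \sqrt{4 + 16 + 16} = 6.$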



Although we do not have any physical analogies to work with when n > 3, nevertheless
we may generalize (1.1.9) in order to define distance in $\mathbb{R}^n$.

Definition If $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is a point in $\mathbb{R}^n$, we define the norm of $\mathbf{x}$, denoted
$\|\mathbf{x}\|$, by

$\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$ (1.1.11)

For two points $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, we define the distance between $\mathbf{x}$ and $\mathbf{y}$, denoted $d(\mathbf{x}, \mathbf{y})$,
by

$d(\mathbf{x}, \mathbf{y}) = \|\mathbf{y} - \mathbf{x}\|.$ (1.1.12)
We will let $\mathbf{0} = (0, 0, \ldots, 0)$ denote the origin in $\mathbb{R}^n$. Then we have

$\|\mathbf{x}\| = d(\mathbf{x}, \mathbf{0});$

that is, the norm of $\mathbf{x}$ is the distance from $\mathbf{x}$ to the origin.

Example If $\mathbf{x} = (2, 3, -1, 5)$, a point in $\mathbb{R}^4$, then the distance from $\mathbf{x}$ to the origin is

$\|\mathbf{x}\| = \sqrt{4 + 9 + 1 + 25} = \sqrt{39}.$

If $\mathbf{y} = (3, 2, 1, 4)$, then the distance from $\mathbf{x}$ to $\mathbf{y}$ is

$d(\mathbf{x}, \mathbf{y}) = \|\mathbf{y} - \mathbf{x}\| = \|(1, -1, 2, -1)\| = \sqrt{7}.$
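As a numerical companion to (1.1.11) and (1.1.12), here is a short sketch that computes the norm and the distance for the four-dimensional example above. The function names `norm` and `distance` are assumptions made for illustration, and points are again represented as tuples.

```python
import math


def norm(x):
    """Norm of a point in R^n, following (1.1.11)."""
    return math.sqrt(sum(c * c for c in x))


def distance(x, y):
    """Distance d(x, y) = ||y - x||, following (1.1.12)."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    return norm(tuple(b - a for a, b in zip(x, y)))


x = (2, 3, -1, 5)
y = (3, 2, 1, 4)
print(norm(x))         # the square root of 39
print(distance(x, y))  # the square root of 7
```

Note that nothing in the code depends on the dimension being 4; the same two functions handle points in any $\mathbb{R}^n$.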

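Note that if $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is a point in $\mathbb{R}^n$ and $a$ is a scalar, then

$\|a\mathbf{x}\| = \|(ax_1, ax_2, \ldots, ax_n)\|$
$= \sqrt{a^2 x_1^2 + a^2 x_2^2 + \cdots + a^2 x_n^2}$
$= |a| \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$
$= |a| \|\mathbf{x}\|.$ (1.1.13)

That is, the norm of a scalar multiple of $\mathbf{x}$ is just the absolute value of the scalar times
the norm of $\mathbf{x}$. In particular, if $\mathbf{x} \neq \mathbf{0}$, then

$\left\| \frac{1}{\|\mathbf{x}\|} \mathbf{x} \right\| = \frac{1}{\|\mathbf{x}\|} \|\mathbf{x}\| = 1.$

That is,

$\frac{1}{\|\mathbf{x}\|} \mathbf{x}$

is a unit distance from the origin.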

Definition Let $\mathbf{p} = (p_1, p_2, \ldots, p_n)$ be a point in $\mathbb{R}^n$ and let $r > 0$ be a real number.
The set of all points $(x_1, x_2, \ldots, x_n)$ in $\mathbb{R}^n$ which satisfy the equation

$(x_1 - p_1)^2 + (x_2 - p_2)^2 + \cdots + (x_n - p_n)^2 = r^2$ (1.1.14)

is called an (n - 1)-dimensional sphere with radius $r$ and center $\mathbf{p}$, which we denote
$S^{n-1}(\mathbf{p}, r)$. The set of all points $(x_1, x_2, \ldots, x_n)$ in $\mathbb{R}^n$ which satisfy the inequality

$(x_1 - p_1)^2 + (x_2 - p_2)^2 + \cdots + (x_n - p_n)^2 < r^2$ (1.1.15)

is called an open n-dimensional ball with radius $r$ and center $\mathbf{p}$, which we denote $B^n(\mathbf{p}, r)$.
The set of all points $(x_1, x_2, \ldots, x_n)$ in $\mathbb{R}^n$ which satisfy the inequality

$(x_1 - p_1)^2 + (x_2 - p_2)^2 + \cdots + (x_n - p_n)^2 \leq r^2$ (1.1.16)

is called a closed n-dimensional ball with radius $r$ and center $\mathbf{p}$, which we denote $\bar{B}^n(\mathbf{p}, r)$.
A sphere $S^{n-1}(\mathbf{p}, r)$ is the set of all points which lie a fixed distance $r$ from a fixed
point $\mathbf{p}$ in $\mathbb{R}^n$. Note that for n = 1, $S^0(p, r)$ consists of only two points, namely, the point
$p - r$ that lies a distance $r$ to the left of $p$ and the point $p + r$ that lies a distance $r$ to the
right of $p$; $B^1(p, r)$ is the open interval $(p - r, p + r)$; and $\bar{B}^1(p, r)$ is the closed interval
$[p - r, p + r]$. In this sense open and closed balls are natural analogs of open and closed
intervals on the real line. For n = 2, a sphere is a circle, an open ball is a disk without its
enclosing circle, and a closed ball is a disk along with its enclosing circle.
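The three definitions differ only in how the distance from the center compares with the radius, which the following sketch makes explicit. It assumes Python 3.8 or later for `math.dist`; the function names and the sample center, radius, and points are hypothetical choices for illustration only.

```python
import math


def in_open_ball(x, p, r):
    """True when x satisfies the strict inequality (1.1.15)."""
    return math.dist(x, p) < r


def in_closed_ball(x, p, r):
    """True when x satisfies the inequality (1.1.16)."""
    return math.dist(x, p) <= r


def on_sphere(x, p, r, tol=1e-12):
    """True when x satisfies (1.1.14), up to a small rounding tolerance."""
    return math.isclose(math.dist(x, p), r, abs_tol=tol)


p, r = (1, 1), 5          # center and radius in R^2 (illustrative values)
boundary = (4, 5)         # exactly 5 units from p
interior = (2, 2)         # about 1.41 units from p

print(in_open_ball(interior, p, r), in_closed_ball(interior, p, r))  # True True
print(in_open_ball(boundary, p, r), in_closed_ball(boundary, p, r))  # False True
print(on_sphere(boundary, p, r))                                     # True
```

The tolerance in `on_sphere` is a practical concession to floating-point arithmetic: an exact equality test on computed distances would rarely succeed for points with non-integer coordinates.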

Vectors 

Many of the quantities of interest in physics, such as velocities, accelerations, and forces, 
involve both a magnitude and a direction. For example, we might speak of a force of 
magnitude 10 newtons acting on an object at the origin in a plane at an angle of f with 
the horizontal. It is common to picture such a quantity as an arrow, with length given by 
the magnitude and with the tip pointing in the specified direction, and to refer to it as a 
vector. Now any point $\mathbf{x} = (x_1, x_2)$, $\mathbf{x} \neq \mathbf{0}$, in $\mathbb{R}^2$ specifies a vector in the plane, namely
the vector starting at the origin and ending at $\mathbf{x}$. The magnitude, or length, of such a
vector is $\|\mathbf{x}\|$ and its direction is specified by the angle $\alpha$ that it makes with the horizontal
axis or by the angle $\beta$ that it makes with the vertical axis. Note that

$\cos(\alpha) = \frac{x_1}{\|\mathbf{x}\|}$

and

$\cos(\beta) = \frac{x_2}{\|\mathbf{x}\|}$

and that, although neither $\cos(\alpha)$ nor $\cos(\beta)$ uniquely determines the direction of the
vector by itself, together they completely determine the direction. See Figure 1.1.4.

Figure 1.1.4 A vector viewed as an arrow from $\mathbf{0} = (0, 0)$ to $\mathbf{x} = (x_1, x_2)$

In general, we may think of $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ either as a point in $\mathbb{R}^n$ or as a vector
in $\mathbb{R}^n$, starting at the origin with length $\|\mathbf{x}\|$. If $\mathbf{x} \neq \mathbf{0}$, we say, in analogy with the case in
$\mathbb{R}^2$, that the direction of $\mathbf{x}$ is the vector

$\mathbf{u} = \frac{1}{\|\mathbf{x}\|} \mathbf{x} = \left( \frac{x_1}{\|\mathbf{x}\|}, \frac{x_2}{\|\mathbf{x}\|}, \ldots, \frac{x_n}{\|\mathbf{x}\|} \right).$ (1.1.17)

The coordinates of this vector $\mathbf{u}$ are called the direction cosines of $\mathbf{x}$ because we may think
of

$u_k = \frac{x_k}{\|\mathbf{x}\|}$

as the cosine of the angle between the vector $\mathbf{x}$ and the kth axis for $k = 1, 2, \ldots, n$, an
interpretation that will become clearer after our discussion of angles in $\mathbb{R}^n$ in the next
section. Alternatively, we may think of $\mathbf{u}$ as a vector of unit length that points in the same
direction as $\mathbf{x}$. Any vector of length 1, such as $\mathbf{u}$, is called a unit vector. We call $\mathbf{0}$ the
zero-vector since it has length 0. Note that $\mathbf{0}$ does not have a direction.

Example The vector $\mathbf{x} = (1, 2, -2, 3)$ in $\mathbb{R}^4$ has length $\|\mathbf{x}\| = \sqrt{18}$ and direction

$\mathbf{u} = \frac{1}{\sqrt{18}} (1, 2, -2, 3).$
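A brief sketch of (1.1.17) in code may help confirm that the direction of a vector is indeed a unit vector. The function name `direction` is our own; the choice to raise an error for the zero vector reflects the remark that the zero vector has no direction.

```python
import math


def direction(x):
    """Return the unit vector (1/||x||) x from (1.1.17)."""
    length = math.sqrt(sum(c * c for c in x))
    if length == 0:
        raise ValueError("the zero vector does not have a direction")
    return tuple(c / length for c in x)


u = direction((1, 2, -2, 3))
print(u)                                  # each coordinate of x divided by sqrt(18)
print(math.sqrt(sum(c * c for c in u)))   # approximately 1, since u is a unit vector
```

The coordinates of `u` are the direction cosines of the vector, so the same function can be reused whenever those are needed.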



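It is now possible to give geometric meanings to our definitions of scalar multiplication,
vector addition, and vector subtraction. First note that if $\mathbf{x} \neq \mathbf{0}$ and $a > 0$, then

$\|a\mathbf{x}\| = a \|\mathbf{x}\|,$

so $a\mathbf{x}$ has direction

$\frac{1}{\|a\mathbf{x}\|} a\mathbf{x} = \frac{a}{a \|\mathbf{x}\|} \mathbf{x} = \frac{1}{\|\mathbf{x}\|} \mathbf{x},$

the same as $\mathbf{x}$. Hence $a\mathbf{x}$ points in the same direction as $\mathbf{x}$, but with length $a$ times the
length of $\mathbf{x}$. If $a < 0$, then

$\|a\mathbf{x}\| = |a| \|\mathbf{x}\| = -a \|\mathbf{x}\|,$

so $a\mathbf{x}$ has direction

$\frac{1}{\|a\mathbf{x}\|} a\mathbf{x} = \frac{a}{-a \|\mathbf{x}\|} \mathbf{x} = -\frac{1}{\|\mathbf{x}\|} \mathbf{x}.$

Hence, in this case, $a\mathbf{x}$ has the opposite direction of $\mathbf{x}$ with length $|a|$ times the length of
$\mathbf{x}$. See Figure 1.1.5 for examples in $\mathbb{R}^2$.

Figure 1.1.5 Examples of scalar multiplication of a vector in $\mathbb{R}^2$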

Next consider two vectors $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$ in $\mathbb{R}^2$ and their sum

$\mathbf{z} = \mathbf{x} + \mathbf{y} = (x_1 + y_1, x_2 + y_2).$

Note that the tip of $\mathbf{z}$ is located $x_1$ units horizontally and $x_2$ units vertically from the tip
of $\mathbf{y}$. Geometrically, the tip of $\mathbf{z}$ is located where the tip of $\mathbf{x}$ would be if $\mathbf{x}$ were first
translated parallel to itself so that its tail coincided with the tip of $\mathbf{y}$. Equivalently, we can view $\mathbf{z}$ as
the diagonal of the parallelogram which has $\mathbf{x}$ and $\mathbf{y}$ for its sides. See Figure 1.1.6 for an
example.

Figure 1.1.6 Example of vector addition in $\mathbb{R}^2$


Finally, consider two vectors $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$ in $\mathbb{R}^2$ and their difference

$\mathbf{z} = \mathbf{x} - \mathbf{y} = (x_1 - y_1, x_2 - y_2).$

Figure 1.1.7 Example of vector subtraction in $\mathbb{R}^2$

Note that since the coordinates of $\mathbf{z}$ are just the differences in the coordinates of $\mathbf{x}$ and $\mathbf{y}$,
$\mathbf{z}$ has the magnitude and direction of an arrow pointing from the tip of $\mathbf{y}$ to the tip of $\mathbf{x}$, as
illustrated in Figure 1.1.7. In other words, we may picture $\mathbf{z}$ geometrically by translating
an arrow drawn from the tip of $\mathbf{y}$ to the tip of $\mathbf{x}$ parallel to itself until its tail is at the
origin.

In the previous discussion it is tempting to think of the arrow from the tip of y to the 
tip of $\mathbf{x}$ as really being $\mathbf{x} - \mathbf{y}$, not just a parallel translate of $\mathbf{x} - \mathbf{y}$. In fact, it is convenient
and useful to think of parallel translates of a given vector, that is, vectors which have the 
same direction and magnitude, but with their tails not at the origin, as all being the same 
vector, just drawn in different places in space. We shall see many instances where viewing 
vectors in this way significantly helps our understanding. 

Before closing this section, we need to call attention to some special vectors. 

Definition The vectors

$\mathbf{e}_1 = (1, 0, 0, \ldots, 0)$
$\mathbf{e}_2 = (0, 1, 0, \ldots, 0)$
$\vdots$ (1.1.18)
$\mathbf{e}_n = (0, 0, 0, \ldots, 1)$

in $\mathbb{R}^n$ are called the standard basis vectors.

Example In $\mathbb{R}^2$ the standard basis vectors are $\mathbf{e}_1 = (1, 0)$ and $\mathbf{e}_2 = (0, 1)$. Note that if
$\mathbf{x} = (x, y)$ is any vector in $\mathbb{R}^2$, then

$\mathbf{x} = (x, 0) + (0, y) = x(1, 0) + y(0, 1) = x\mathbf{e}_1 + y\mathbf{e}_2.$

For example, $(2, 5) = 2\mathbf{e}_1 + 5\mathbf{e}_2$.

Example In $\mathbb{R}^3$ the standard basis vectors are $\mathbf{e}_1 = (1, 0, 0)$, $\mathbf{e}_2 = (0, 1, 0)$, and $\mathbf{e}_3 =
(0, 0, 1)$. Note that if $\mathbf{x} = (x, y, z)$ is any vector in $\mathbb{R}^3$, then

$\mathbf{x} = (x, 0, 0) + (0, y, 0) + (0, 0, z) = x(1, 0, 0) + y(0, 1, 0) + z(0, 0, 1) = x\mathbf{e}_1 + y\mathbf{e}_2 + z\mathbf{e}_3.$

For example, $(1, 2, -4) = \mathbf{e}_1 + 2\mathbf{e}_2 - 4\mathbf{e}_3$.
The previous two examples are easily generalized to show that any vector in $\mathbb{R}^n$ may
be written as a sum of scalar multiples of the standard basis vectors. Specifically, if
$\mathbf{x} = (x_1, x_2, \ldots, x_n)$, then we may write $\mathbf{x}$ as

$\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n.$ (1.1.19)

We say that $\mathbf{x}$ is a linear combination of the standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$. It is also
important to note that there is only one choice for the scalars in this linear combination.
That is, for any vector $\mathbf{x}$ in $\mathbb{R}^n$ there is one and only one way to write $\mathbf{x}$ as a linear
combination of the standard basis vectors.

Notes on notation 

In this text, we will denote vectors using a plain bold font. This is a common convention, 
but not the only one used for denoting vectors. Another frequently used convention is to 
place arrows above a variable which denotes a vector. For example, one might write $\vec{x}$ for
what we have been denoting $\mathbf{x}$.

It is also worth noting that in many books the standard basis vectors in $\mathbb{R}^2$ are denoted
by $\mathbf{i}$ and $\mathbf{j}$ (or $\vec{i}$ and $\vec{j}$), and the standard basis vectors in $\mathbb{R}^3$ by $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ (or $\vec{i}$, $\vec{j}$, and
$\vec{k}$). Since this notation is not easy to extend to higher dimensions, we will not make much
use of it.

Problems 

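1. Let $\mathbf{x} = (1, 2)$, $\mathbf{y} = (2, 3)$, and $\mathbf{z} = (-2, 4)$. For each of the following, plot the points
$\mathbf{x}$, $\mathbf{y}$, $\mathbf{z}$, and the indicated point $\mathbf{w}$.

(a) $\mathbf{w} = \mathbf{x} + \mathbf{y}$ (b) $\mathbf{w} = 2\mathbf{x} - \mathbf{y}$

(c) $\mathbf{w} = \mathbf{z} - 2\mathbf{x}$ (d) $\mathbf{w} = 3\mathbf{x} + 2\mathbf{y} - \mathbf{z}$

2. Let $\mathbf{x} = (1, 3, -1)$, $\mathbf{y} = (3, 2, 1)$, and $\mathbf{z} = (-2, 4, -2)$. Compute each of the following.

(a) $\mathbf{x} + \mathbf{y}$ (b) $\mathbf{x} - \mathbf{z} + 3\mathbf{y}$

(c) $3\mathbf{z} - 2\mathbf{y}$ (d) $-3\mathbf{x} + 4\mathbf{z}$

3. Let $\mathbf{x} = (1, -1, 2, 3)$, $\mathbf{y} = (-2, 3, 1, -2)$, and $\mathbf{z} = (2, 1, 3, -4)$. Compute each of the
following.

(a) $\mathbf{x} - 2\mathbf{z}$ (b) $\mathbf{y} + \mathbf{x} - 3\mathbf{z}$

(c) $-3\mathbf{y} - \mathbf{x} + 4\mathbf{z}$ (d) $\mathbf{x} + 3\mathbf{z} - 4\mathbf{y}$

4. Let $\mathbf{x} = (1, 2)$ and $\mathbf{y} = (-2, 3)$. Compute each of the following.

(a) $\|\mathbf{x}\|$ (b) $\|\mathbf{x} - \mathbf{y}\|$

(c) $\|3\mathbf{x}\|$ (d) $\|-4\mathbf{y}\|$

5. Let $\mathbf{x} = (2, 3, -1)$, $\mathbf{y} = (2, -1, 5)$, and $\mathbf{z} = (3, -1, -2)$. Compute each of the following.

(a) $\|\mathbf{x}\|$ (b) $\|\mathbf{x} + 2\mathbf{y}\|$

(c) $\|-5\mathbf{x}\|$ (d) $\|\mathbf{x} + \mathbf{y} + \mathbf{z}\|$

6. Find the distances between the following pairs of points.

(a) $\mathbf{x} = (3, 2)$, $\mathbf{y} = (-1, 3)$ (b) $\mathbf{x} = (1, 2, 1)$, $\mathbf{y} = (-2, -1, 3)$

(c) $\mathbf{x} = (4, 2, 1, -1)$, $\mathbf{y} = (1, 3, 2, -2)$ (d) $\mathbf{z} = (3, -3, 0)$, $\mathbf{y} = (-1, 2, -5)$

(e) $\mathbf{w} = (1, 2, 4, -2, 3, -1)$, $\mathbf{u} = (3, 2, 1, -3, 2, 1)$

7. Draw a picture of the following sets of points in $\mathbb{R}^2$.

(a) $S^1((1, 2), 1)$ (b) $B^2((1, 2), 1)$ (c) $\bar{B}^2((1, 2), 1)$

8. Draw a picture of the following sets of points in $\mathbb{R}$.

(a) $S^0(1, 3)$ (b) $B^1(1, 3)$ (c) $\bar{B}^1(1, 3)$

9. Describe the differences between $S^2((1, 2, 1), 1)$, $B^3((1, 2, 1), 1)$, and $\bar{B}^3((1, 2, 1), 1)$ in
$\mathbb{R}^3$.

10. Is the point $(1, 4, 5)$ in the open ball $B^3((-1, 2, 3), 4)$?

11. Is the point $(3, 2, -1, 4, 1)$ in the open ball $B^5((1, 2, -4, 2, 3), 3)$?

12. Find the length and direction of the following vectors.

(a) $\mathbf{x} = (2, 1)$ (b) $\mathbf{z} = (1, 1, -1)$

(c) $\mathbf{x} = (-1, 2, 3)$ (d) $\mathbf{w} = (1, -1, 2, -3)$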

13. Let $\mathbf{x} = (1, 3)$, $\mathbf{y} = (4, 1)$, and $\mathbf{z} = (2, -1)$. Plot $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$. Also, show how to obtain
each of the following geometrically. 

(a) w = x + y (b) w = y - x 

(c) w = 3z (d) w = -2z 

(e) w = ^z (f) w = x + y + z 

(g) w = x + 3z (h) w = x - Jy 

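14. Suppose $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, $\mathbf{y} = (y_1, y_2, \ldots, y_n)$, and $\mathbf{z} = (z_1, z_2, \ldots, z_n)$ are vectors
in $\mathbb{R}^n$ and $a$, $b$, and $c$ are scalars. Verify the following.

(a) $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$ (b) $\mathbf{x} + (\mathbf{y} + \mathbf{z}) = (\mathbf{x} + \mathbf{y}) + \mathbf{z}$

(c) $a(\mathbf{x} + \mathbf{y}) = a\mathbf{x} + a\mathbf{y}$ (d) $(a + b)\mathbf{x} = a\mathbf{x} + b\mathbf{x}$

(e) $a(b\mathbf{x}) = (ab)\mathbf{x}$ (f) $\mathbf{x} + \mathbf{0} = \mathbf{x}$

(g) $1\mathbf{x} = \mathbf{x}$ (h) $\mathbf{x} + (-\mathbf{x}) = \mathbf{0}$, where $-\mathbf{x} = -1\mathbf{x}$

15. Let $\mathbf{u} = (1, 1)$ and $\mathbf{v} = (-1, 1)$ be vectors in $\mathbb{R}^2$.

(a) Let $\mathbf{x} = (2, 1)$. Find scalars $a$ and $b$ such that $\mathbf{x} = a\mathbf{u} + b\mathbf{v}$. Are $a$ and $b$ unique?

(b) Let $\mathbf{x} = (x, y)$ be an arbitrary vector in $\mathbb{R}^2$. Show that there exist unique scalars
$a$ and $b$ such that $\mathbf{x} = a\mathbf{u} + b\mathbf{v}$.

(c) The result in (b) shows that $\mathbf{u}$ and $\mathbf{v}$ form a basis for $\mathbb{R}^2$ which is different from the
standard basis of $\mathbf{e}_1$ and $\mathbf{e}_2$. Show that the vectors $\mathbf{u} = (1, 1)$ and $\mathbf{w} = (-1, -1)$
do not form a basis for $\mathbb{R}^2$. (Hint: Show that there do not exist scalars $a$ and $b$
such that $\mathbf{x} = a\mathbf{u} + b\mathbf{w}$ when $\mathbf{x} = (2, 1)$.)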
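Suppose $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$ are two vectors in $\mathbb{R}^2$, neither of which is the zero
vector $\mathbf{0}$. Let $\alpha$ and $\beta$ be the angles between $\mathbf{x}$ and $\mathbf{y}$ and the positive horizontal axis,
respectively, measured in the counterclockwise direction. Supposing $\alpha \geq \beta$, let $\theta = \alpha - \beta$.
Then $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$ measured in the counterclockwise direction, as shown
in Figure 1.2.1. From the subtraction formula for cosine we have

$\cos(\theta) = \cos(\alpha - \beta) = \cos(\alpha)\cos(\beta) + \sin(\alpha)\sin(\beta).$ (1.2.1)

Now

$\cos(\alpha) = \frac{x_1}{\|\mathbf{x}\|},$
$\cos(\beta) = \frac{y_1}{\|\mathbf{y}\|},$
$\sin(\alpha) = \frac{x_2}{\|\mathbf{x}\|},$

and

$\sin(\beta) = \frac{y_2}{\|\mathbf{y}\|}.$

Thus, we have

$\cos(\theta) = \frac{x_1 y_1}{\|\mathbf{x}\| \|\mathbf{y}\|} + \frac{x_2 y_2}{\|\mathbf{x}\| \|\mathbf{y}\|} = \frac{x_1 y_1 + x_2 y_2}{\|\mathbf{x}\| \|\mathbf{y}\|}.$ (1.2.2)

Figure 1.2.1 The angle between two vectors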
Example Let $\theta$ be the smallest angle between $\mathbf{x} = (2, 1)$ and $\mathbf{y} = (1, 3)$, measured in
the counterclockwise direction. Then, by (1.2.2), we must have

$\cos(\theta) = \frac{(2)(1) + (1)(3)}{\|\mathbf{x}\| \|\mathbf{y}\|} = \frac{5}{\sqrt{5}\sqrt{10}} = \frac{1}{\sqrt{2}}.$

Hence

$\theta = \cos^{-1}\left(\frac{1}{\sqrt{2}}\right) = \frac{\pi}{4}.$

See Figure 1.2.2.



With more work it is possible to show that if $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$ are
two vectors in $\mathbb{R}^3$, neither of which is the zero vector $\mathbf{0}$, and $\theta$ is the smallest positive angle
between $\mathbf{x}$ and $\mathbf{y}$, then

$\cos(\theta) = \frac{x_1 y_1 + x_2 y_2 + x_3 y_3}{\|\mathbf{x}\| \|\mathbf{y}\|}.$ (1.2.3)

The term which appears in the numerators in both (1.2.2) and (1.2.3) arises frequently, so
we will give it a name.

Definition If $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ are vectors in $\mathbb{R}^n$, then the
dot product of $\mathbf{x}$ and $\mathbf{y}$, denoted $\mathbf{x} \cdot \mathbf{y}$, is given by

$\mathbf{x} \cdot \mathbf{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$ (1.2.4)

Note that the dot product of two vectors is a scalar, not another vector. Because of
this, the dot product is also called the scalar product. It is also an example of what is
called an inner product and is often denoted by $\langle \mathbf{x}, \mathbf{y} \rangle$.
Example If $\mathbf{x} = (1, 2, -3, -2)$ and $\mathbf{y} = (-1, 2, 3, 5)$, then

$\mathbf{x} \cdot \mathbf{y} = (1)(-1) + (2)(2) + (-3)(3) + (-2)(5) = -1 + 4 - 9 - 10 = -16.$
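The dot product (1.2.4) translates almost word for word into code. The sketch below, with the assumed function name `dot` and vectors stored as tuples, reproduces the computation in this example.

```python
def dot(x, y):
    """Dot product of two vectors in R^n, following (1.2.4)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of coordinates")
    return sum(a * b for a, b in zip(x, y))


x = (1, 2, -3, -2)
y = (-1, 2, 3, 5)
print(dot(x, y))  # -16
print(dot(y, x))  # -16, the same value in either order
```

The result is a single number rather than a tuple, which mirrors the observation that the dot product of two vectors is a scalar.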

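The next proposition lists some useful properties of the dot product.

Proposition For any vectors $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ in $\mathbb{R}^n$ and scalar $a$,

$\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x},$ (1.2.5)
$\mathbf{x} \cdot (\mathbf{y} + \mathbf{z}) = \mathbf{x} \cdot \mathbf{y} + \mathbf{x} \cdot \mathbf{z},$ (1.2.6)
$(a\mathbf{x}) \cdot \mathbf{y} = a(\mathbf{x} \cdot \mathbf{y}),$ (1.2.7)
$\mathbf{0} \cdot \mathbf{x} = 0,$ (1.2.8)
$\mathbf{x} \cdot \mathbf{x} \geq 0,$ (1.2.9)
$\mathbf{x} \cdot \mathbf{x} = 0$ only if $\mathbf{x} = \mathbf{0},$ (1.2.10)

and

$\mathbf{x} \cdot \mathbf{x} = \|\mathbf{x}\|^2.$ (1.2.11)

These properties are all easily verifiable using the properties of real numbers and the
definition of the dot product and will be left to Problem 9 for you to check.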

At this point we can say that if $\mathbf{x}$ and $\mathbf{y}$ are two nonzero vectors in either $\mathbb{R}^2$ or $\mathbb{R}^3$
and $\theta$ is the smallest positive angle between $\mathbf{x}$ and $\mathbf{y}$, then

$\cos(\theta) = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| \|\mathbf{y}\|}.$ (1.2.12)

We would like to be able to make the same statement about the angle between two vectors
in any dimension, but we would first have to define what we mean by the angle between
two vectors in $\mathbb{R}^n$ for n > 3. The simplest way to do this is to turn things around and use
(1.2.12) to define the angle. However, in order for this to work we must first know that

$-1 \leq \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| \|\mathbf{y}\|} \leq 1,$

since this is the range of values for the cosine function. This fact follows from the following
inequality.

Cauchy-Schwarz Inequality For all $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$,

$|\mathbf{x} \cdot \mathbf{y}| \leq \|\mathbf{x}\| \|\mathbf{y}\|.$ (1.2.13)

To see why this is so, first note that both sides of (1.2.13) are 0 when $\mathbf{y} = \mathbf{0}$, and hence
are equal in this case. Assuming $\mathbf{x}$ and $\mathbf{y}$ are fixed vectors in $\mathbb{R}^n$, with $\mathbf{y} \neq \mathbf{0}$, let $t$ be a
real number and consider the function

$f(t) = (\mathbf{x} + t\mathbf{y}) \cdot (\mathbf{x} + t\mathbf{y}).$ (1.2.14)
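By (1.2.9), $f(t) \geq 0$ for all $t$, while from (1.2.6), (1.2.7), and (1.2.11), we see that

$f(t) = \mathbf{x} \cdot \mathbf{x} + \mathbf{x} \cdot t\mathbf{y} + t\mathbf{y} \cdot \mathbf{x} + t\mathbf{y} \cdot t\mathbf{y} = \|\mathbf{x}\|^2 + 2(\mathbf{x} \cdot \mathbf{y})t + \|\mathbf{y}\|^2 t^2.$ (1.2.15)

Hence $f$ is a quadratic polynomial with at most one root. Since the roots of $f$ are, as given
by the quadratic formula,

$\frac{-2(\mathbf{x} \cdot \mathbf{y}) \pm \sqrt{4(\mathbf{x} \cdot \mathbf{y})^2 - 4\|\mathbf{x}\|^2 \|\mathbf{y}\|^2}}{2\|\mathbf{y}\|^2},$

it follows that we must have

$4(\mathbf{x} \cdot \mathbf{y})^2 - 4\|\mathbf{x}\|^2 \|\mathbf{y}\|^2 \leq 0.$ (1.2.16)

Thus

$(\mathbf{x} \cdot \mathbf{y})^2 \leq \|\mathbf{x}\|^2 \|\mathbf{y}\|^2,$ (1.2.17)

and so

$|\mathbf{x} \cdot \mathbf{y}| \leq \|\mathbf{x}\| \|\mathbf{y}\|.$ (1.2.18)

Note that $|\mathbf{x} \cdot \mathbf{y}| = \|\mathbf{x}\| \|\mathbf{y}\|$ if and only if there is some value of $t$ for which $f(t) = 0$, which,
by (1.2.8) and (1.2.10), happens if and only if $\mathbf{x} + t\mathbf{y} = \mathbf{0}$, that is, $\mathbf{x} = -t\mathbf{y}$, for some
value of $t$. Moreover, if $\mathbf{y} = \mathbf{0}$, then $\mathbf{y} = 0\mathbf{x}$ for any $\mathbf{x}$ in $\mathbb{R}^n$. Hence, in either case, the
Cauchy-Schwarz inequality becomes an equality if and only if either $\mathbf{x}$ is a scalar multiple
of $\mathbf{y}$ or $\mathbf{y}$ is a scalar multiple of $\mathbf{x}$.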
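A numerical check is no substitute for the argument above, but it can make the inequality and its equality case concrete. The following sketch computes the difference between the two sides of (1.2.13); the function names, including `cauchy_schwarz_gap`, and the sample vectors are illustrative assumptions.

```python
import math


def dot(x, y):
    """Dot product, following (1.2.4)."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Norm computed from the dot product, using (1.2.11)."""
    return math.sqrt(dot(x, x))


def cauchy_schwarz_gap(x, y):
    """Return ||x|| ||y|| - |x . y|, which (1.2.13) says is never negative."""
    return norm(x) * norm(y) - abs(dot(x, y))


# Neither vector is a scalar multiple of the other: the gap is strictly positive.
gap_generic = cauchy_schwarz_gap((1, 2, 3), (1, -2, 2))
print(gap_generic)

# Here y = -2x, so the gap is zero up to floating-point rounding.
gap_parallel = cauchy_schwarz_gap((1, -3), (-2, 6))
print(abs(gap_parallel) < 1e-9)  # True
```

Because square roots are computed in floating point, the equality case is tested against a small tolerance rather than against exactly zero.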

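With the Cauchy-Schwarz inequality we have

$-1 \leq \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| \|\mathbf{y}\|} \leq 1$ (1.2.19)

for any nonzero vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$. Thus we may now state the following definition.

Definition If $\mathbf{x}$ and $\mathbf{y}$ are nonzero vectors in $\mathbb{R}^n$, then we call

$\theta = \cos^{-1}\left(\frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| \|\mathbf{y}\|}\right)$ (1.2.20)

the angle between $\mathbf{x}$ and $\mathbf{y}$.

Example Suppose $\mathbf{x} = (1, 2, 3)$ and $\mathbf{y} = (1, -2, 2)$. Then $\mathbf{x} \cdot \mathbf{y} = 1 - 4 + 6 = 3$, $\|\mathbf{x}\| = \sqrt{14}$,
and $\|\mathbf{y}\| = 3$, so if $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$, we have

$\cos(\theta) = \frac{3}{3\sqrt{14}} = \frac{1}{\sqrt{14}}.$

Hence, rounding to four decimal places,

$\theta = \cos^{-1}\left(\frac{1}{\sqrt{14}}\right) = 1.3002.$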
Example Suppose $\mathbf{x} = (2, -1, 3, 1)$ and $\mathbf{y} = (-2, 3, 1, -4)$. Then $\mathbf{x} \cdot \mathbf{y} = -8$, $\|\mathbf{x}\| = \sqrt{15}$,
and $\|\mathbf{y}\| = \sqrt{30}$, so if $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$, we have, rounding to four decimal
places,

$\theta = \cos^{-1}\left(\frac{-8}{\sqrt{15}\sqrt{30}}\right) = 1.9575.$


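Example Let $\mathbf{x}$ be a vector in $\mathbb{R}^n$ and let $\alpha_k$, $k = 1, 2, \ldots, n$, be the angle between $\mathbf{x}$
and the kth axis. Then $\alpha_k$ is the angle between $\mathbf{x}$ and the standard basis vector $\mathbf{e}_k$. Thus

$\cos(\alpha_k) = \frac{\mathbf{x} \cdot \mathbf{e}_k}{\|\mathbf{x}\| \|\mathbf{e}_k\|} = \frac{x_k}{\|\mathbf{x}\|}.$

That is, $\cos(\alpha_1), \cos(\alpha_2), \ldots, \cos(\alpha_n)$ are the direction cosines of $\mathbf{x}$ as defined in Section
1.1. For example, if $\mathbf{x} = (3, 1, 2)$ in $\mathbb{R}^3$, then $\|\mathbf{x}\| = \sqrt{14}$ and the direction cosines of $\mathbf{x}$ are

$\cos(\alpha_1) = \frac{3}{\sqrt{14}},$

$\cos(\alpha_2) = \frac{1}{\sqrt{14}},$

and

$\cos(\alpha_3) = \frac{2}{\sqrt{14}},$

giving us, to four decimal places,

$\alpha_1 = 0.6405,$
$\alpha_2 = 1.3002,$

and

$\alpha_3 = 1.0069.$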



Note that if $\mathbf{x}$ and $\mathbf{y}$ are nonzero vectors in $\mathbb{R}^n$ with $\mathbf{x} \cdot \mathbf{y} = 0$, then the angle between
$\mathbf{x}$ and $\mathbf{y}$ is

$\cos^{-1}(0) = \frac{\pi}{2}.$

This is the motivation behind our next definition.

Definition Vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$ are said to be orthogonal (or perpendicular), denoted
$\mathbf{x} \perp \mathbf{y}$, if $\mathbf{x} \cdot \mathbf{y} = 0$.

It is a convenient convention of mathematics not to restrict the definition of orthogonality
to nonzero vectors. Hence it follows from the definition, and (1.2.8), that $\mathbf{0}$ is
orthogonal to every vector in $\mathbb{R}^n$. Moreover, $\mathbf{0}$ is the only vector in $\mathbb{R}^n$ which has this
property, a fact you will be asked to verify in Problem 12.

Example The vectors $\mathbf{x} = (-1, -2)$ and $\mathbf{y} = (1, 2)$ are both orthogonal to $\mathbf{z} = (2, -1)$
in $\mathbb{R}^2$. Note that $\mathbf{y} = -\mathbf{x}$ and, in fact, any scalar multiple of $\mathbf{x}$ is orthogonal to $\mathbf{z}$.

Example In $\mathbb{R}^4$, $\mathbf{x} = (1, -1, 1, -1)$ is orthogonal to $\mathbf{y} = (1, 1, 1, 1)$. As in the previous
example, any scalar multiple of $\mathbf{x}$ is orthogonal to $\mathbf{y}$.

Definition We say vectors $\mathbf{x}$ and $\mathbf{y}$ are parallel if $\mathbf{x} = a\mathbf{y}$ for some scalar $a \neq 0$.

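This definition says that vectors are parallel when one is a nonzero scalar multiple of
the other. From our proof of the Cauchy-Schwarz inequality we know that it follows that
if $\mathbf{x}$ and $\mathbf{y}$ are parallel, then $|\mathbf{x} \cdot \mathbf{y}| = \|\mathbf{x}\| \|\mathbf{y}\|$. Thus if $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$,

$\cos(\theta) = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| \|\mathbf{y}\|} = \pm 1.$

That is, $\theta = 0$ or $\theta = \pi$. Put another way, $\mathbf{x}$ and $\mathbf{y}$ either point in the same direction or
they point in opposite directions.

Example The vectors $\mathbf{x} = (1, -3)$ and $\mathbf{y} = (-2, 6)$ are parallel since $\mathbf{x} = -\frac{1}{2}\mathbf{y}$. Note
that $\mathbf{x} \cdot \mathbf{y} = -20$ and $\|\mathbf{x}\| \|\mathbf{y}\| = \sqrt{10}\sqrt{40} = 20$, so $\mathbf{x} \cdot \mathbf{y} = -\|\mathbf{x}\| \|\mathbf{y}\|$. It follows that the
angle between $\mathbf{x}$ and $\mathbf{y}$ is $\pi$.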

Two basic results about triangles in $\mathbb{R}^2$ and $\mathbb{R}^3$ are the triangle inequality (the sum
of the lengths of two sides of a triangle is greater than or equal to the length of the third
side) and the Pythagorean theorem (the sum of the squares of the lengths of the legs of a
right triangle is equal to the square of the length of the hypotenuse). In terms of vectors in
$\mathbb{R}^n$, if we picture a vector $\mathbf{x}$ with its tail at the origin and a vector $\mathbf{y}$ with its tail at the
tip of $\mathbf{x}$ as two sides of a triangle, then the remaining side is given by the vector $\mathbf{x} + \mathbf{y}$.
Thus the triangle inequality may be stated as follows.

Triangle inequality If $\mathbf{x}$ and $\mathbf{y}$ are vectors in $\mathbb{R}^n$, then

$\|\mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|.$ (1.2.21)

The first step in verifying (1.2.21) is to note that, using (1.2.11) and (1.2.6),

$\|\mathbf{x} + \mathbf{y}\|^2 = (\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} + \mathbf{y})$
$= \mathbf{x} \cdot \mathbf{x} + 2(\mathbf{x} \cdot \mathbf{y}) + \mathbf{y} \cdot \mathbf{y}$
$= \|\mathbf{x}\|^2 + 2(\mathbf{x} \cdot \mathbf{y}) + \|\mathbf{y}\|^2.$ (1.2.22)

Since $\mathbf{x} \cdot \mathbf{y} \leq \|\mathbf{x}\| \|\mathbf{y}\|$ by the Cauchy-Schwarz inequality, it follows that

$\|\mathbf{x} + \mathbf{y}\|^2 \leq \|\mathbf{x}\|^2 + 2\|\mathbf{x}\| \|\mathbf{y}\| + \|\mathbf{y}\|^2 = (\|\mathbf{x}\| + \|\mathbf{y}\|)^2,$

from which we obtain the triangle inequality by taking square roots.
Note that in (1.2.22) we have

$\|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2$

if and only if $\mathbf{x} \cdot \mathbf{y} = 0$, that is, if and only if $\mathbf{x} \perp \mathbf{y}$. Hence we have the following famous
result.
Pythagorean theorem Vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$ are orthogonal if and only if

$\|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2.$ (1.2.23)
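To see (1.2.21) and (1.2.23) at work on specific vectors, consider the following sketch. The helper names are our own, and the vectors are chosen for illustration: the first pair is orthogonal in $\mathbb{R}^4$, while the second pair in $\mathbb{R}^3$ is not.

```python
import math


def dot(x, y):
    """Dot product, following (1.2.4)."""
    return sum(a * b for a, b in zip(x, y))


def add(x, y):
    """Coordinate-wise sum of two vectors."""
    return tuple(a + b for a, b in zip(x, y))


def norm(x):
    """Norm computed from the dot product, using (1.2.11)."""
    return math.sqrt(dot(x, x))


# An orthogonal pair: squared norms add, as (1.2.23) predicts.
x = (1, -1, 1, -1)
y = (1, 1, 1, 1)
s = add(x, y)
print(dot(x, y))                         # 0
print(dot(s, s), dot(x, x) + dot(y, y))  # 8 8

# A non-orthogonal pair: only the triangle inequality (1.2.21) is guaranteed.
p = (1, 2, 3)
q = (1, -2, 2)
print(norm(add(p, q)) <= norm(p) + norm(q))  # True
```

Working with squared norms in the first part keeps all the arithmetic in integers, so the equality can be checked exactly.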



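Perhaps the most important application of the dot product is in finding the orthogonal
projection of one vector onto another. This is illustrated in Figure 1.2.3, where $\mathbf{w}$ represents
the projection of $\mathbf{x}$ onto $\mathbf{y}$. The result of the projection is to break $\mathbf{x}$ into the sum of two
components, $\mathbf{w}$, which is parallel to $\mathbf{y}$, and $\mathbf{x} - \mathbf{w}$, which is orthogonal to $\mathbf{y}$, a procedure
which is frequently very useful. To compute $\mathbf{w}$, note that if $\theta$ is the angle between $\mathbf{x}$ and
$\mathbf{y}$, then

$\|\mathbf{w}\| = \|\mathbf{x}\| |\cos(\theta)| = \|\mathbf{x}\| \frac{|\mathbf{x} \cdot \mathbf{y}|}{\|\mathbf{x}\| \|\mathbf{y}\|} = \frac{|\mathbf{x} \cdot \mathbf{y}|}{\|\mathbf{y}\|} = |\mathbf{x} \cdot \mathbf{u}|,$ (1.2.24)

where

$\mathbf{u} = \frac{1}{\|\mathbf{y}\|} \mathbf{y}$

is the direction of $\mathbf{y}$. Hence $\mathbf{w} = |\mathbf{x} \cdot \mathbf{u}| \mathbf{u}$ when $0 \leq \theta \leq \frac{\pi}{2}$, which is when $\mathbf{x} \cdot \mathbf{u} \geq 0$, and
$\mathbf{w} = -|\mathbf{x} \cdot \mathbf{u}| \mathbf{u}$ when $\frac{\pi}{2} < \theta \leq \pi$, which is when $\mathbf{x} \cdot \mathbf{u} < 0$. Thus, in either case, $\mathbf{w} = (\mathbf{x} \cdot \mathbf{u})\mathbf{u}$.

Figure 1.2.3 Orthogonal projection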



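Definition Given vectors $\mathbf{x}$ and $\mathbf{y}$, $\mathbf{y} \neq \mathbf{0}$, in $\mathbb{R}^n$, the vector

$\mathbf{w} = (\mathbf{x} \cdot \mathbf{u})\mathbf{u},$ (1.2.25)

where $\mathbf{u}$ is the direction of $\mathbf{y}$, is called the orthogonal projection, or simply projection, of $\mathbf{x}$
onto $\mathbf{y}$. We also call $\mathbf{w}$ the component of $\mathbf{x}$ in the direction of $\mathbf{y}$ and $\mathbf{x} \cdot \mathbf{u}$ the coordinate
of $\mathbf{x}$ in the direction of $\mathbf{y}$.

In the special case where $\mathbf{y} = \mathbf{e}_k$, the kth standard basis vector, $k = 1, 2, \ldots, n$, we see
that the coordinate of $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ in the direction of $\mathbf{y}$ is just $\mathbf{x} \cdot \mathbf{e}_k = x_k$, the kth
coordinate of $\mathbf{x}$.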
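Example Suppose $\mathbf{x} = (1, 2, 3)$ and $\mathbf{y} = (1, 4, 0)$. Then the direction of $\mathbf{y}$ is

$\mathbf{u} = \frac{1}{\sqrt{17}} (1, 4, 0),$

so the coordinate of $\mathbf{x}$ in the direction of $\mathbf{y}$ is

$\mathbf{x} \cdot \mathbf{u} = \frac{1}{\sqrt{17}} (1 + 8 + 0) = \frac{9}{\sqrt{17}}.$

Thus the projection of $\mathbf{x}$ onto $\mathbf{y}$ is

$\mathbf{w} = \frac{9}{\sqrt{17}} \mathbf{u} = \frac{9}{17} (1, 4, 0) = \left( \frac{9}{17}, \frac{36}{17}, 0 \right).$
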
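The projection formula (1.2.25) can be sketched in a few lines. The function name `project` and its choice to return both the coordinate and the projection are our own assumptions; the example reuses the vectors from the computation above and also checks that what remains after subtracting the projection is orthogonal to the second vector.

```python
import math


def dot(x, y):
    """Dot product, following (1.2.4)."""
    return sum(a * b for a, b in zip(x, y))


def project(x, y):
    """Return (coordinate, projection) of x in the direction of y, per (1.2.25)."""
    length = math.sqrt(dot(y, y))
    if length == 0:
        raise ValueError("cannot project onto the zero vector")
    u = tuple(c / length for c in y)    # direction of y
    coordinate = dot(x, u)              # coordinate of x in the direction of y
    w = tuple(coordinate * c for c in u)
    return coordinate, w


x = (1, 2, 3)
y = (1, 4, 0)
coordinate, w = project(x, y)
print(coordinate)  # approximately 9 / sqrt(17)
print(w)           # approximately (9/17, 36/17, 0)

# The remainder x - w should be orthogonal to y, up to rounding.
remainder = tuple(a - b for a, b in zip(x, w))
print(abs(dot(remainder, y)) < 1e-12)  # True
```

Returning the coordinate alongside the projection avoids recomputing the dot product when both quantities are needed.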

Problems 

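1. Let $\mathbf{x} = (3, -2)$, $\mathbf{y} = (-2, 5)$, and $\mathbf{z} = (4, 1)$. Compute each of the following.

(a) $\mathbf{x} \cdot \mathbf{y}$ (b) $2\mathbf{x} \cdot \mathbf{y}$

(c) $\mathbf{x} \cdot (3\mathbf{y} - \mathbf{z})$ (d) $-\mathbf{z} \cdot (\mathbf{x} + 5\mathbf{y})$

2. Let $\mathbf{x} = (3, -2, 1)$, $\mathbf{y} = (-2, 3, 5)$, and $\mathbf{z} = (-1, 4, 1)$. Compute each of the following.

(a) $\mathbf{x} \cdot \mathbf{y}$ (b) $2\mathbf{x} \cdot \mathbf{y}$

(c) $\mathbf{x} \cdot (3\mathbf{y} - \mathbf{z})$ (d) $-\mathbf{z} \cdot (\mathbf{x} + 5\mathbf{y})$

3. Let $\mathbf{x} = (3, -2, 1, 2)$, $\mathbf{y} = (-2, 3, 4, -5)$, and $\mathbf{z} = (-1, 4, 1, -2)$. Compute each of the
following.

(a) $\mathbf{x} \cdot \mathbf{y}$ (b) $2\mathbf{x} \cdot \mathbf{y}$

(c) $\mathbf{x} \cdot (3\mathbf{y} - \mathbf{z})$ (d) $-\mathbf{z} \cdot (\mathbf{x} + 5\mathbf{y})$

4. Find the angles between the following pairs of vectors. First find your answers in
radians and then convert to degrees.

(a) $\mathbf{x} = (1, 2)$, $\mathbf{y} = (2, 1)$ (b) $\mathbf{z} = (3, 1)$, $\mathbf{w} = (-3, 1)$

(c) $\mathbf{x} = (1, 1, 1)$, $\mathbf{y} = (-1, 1, -1)$ (d) $\mathbf{y} = (-1, 2, 4)$, $\mathbf{z} = (2, 3, -1)$

(e) $\mathbf{x} = (1, 2, 1, 2)$, $\mathbf{y} = (2, 1, 2, 1)$ (f) $\mathbf{x} = (1, 2, 3, 4, 5)$, $\mathbf{z} = (5, 4, 3, 2, 1)$

5. The three points $(2, 1)$, $(1, 2)$, and $(-2, 1)$ determine a triangle in $\mathbb{R}^2$. Find the measure
of its three angles and verify that their sum is $\pi$.

6. Given three points $\mathbf{p}$, $\mathbf{q}$, and $\mathbf{r}$ in $\mathbb{R}^n$, the vectors $\mathbf{q} - \mathbf{p}$, $\mathbf{r} - \mathbf{p}$, and $\mathbf{q} - \mathbf{r}$ describe the
sides of the triangle with vertices at $\mathbf{p}$, $\mathbf{q}$, and $\mathbf{r}$. For each of the following, find the
measure of the three angles of the triangle with vertices at the given points.

(a) $\mathbf{p} = (1, 2, 1)$, $\mathbf{q} = (-1, -1, 2)$, $\mathbf{r} = (-1, 3, -1)$

(b) $\mathbf{p} = (1, 2, 1, 1)$, $\mathbf{q} = (-1, -1, 2, 3)$, $\mathbf{r} = (-1, 3, -1, 2)$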
7. For each of the following, find the angles between the given vector and the coordinate
axes.

(a) $\mathbf{x} = (-2, 3)$ (b) $\mathbf{w} = (-1, 2, 1)$

(c) $\mathbf{y} = (2, 3, 1, -1)$ (d) $\mathbf{x} = (1, 2, 3, 4, 5)$

8. For each of the following, find the coordinate of $\mathbf{x}$ in the direction of $\mathbf{y}$ and the projection
$\mathbf{w}$ of $\mathbf{x}$ onto $\mathbf{y}$. In each case verify that $\mathbf{y} \perp (\mathbf{x} - \mathbf{w})$.

(a) $\mathbf{x} = (-2, 4)$, $\mathbf{y} = (4, 1)$ (b) $\mathbf{x} = (4, 1, 4)$, $\mathbf{y} = (-1, 3, 1)$

(c) $\mathbf{x} = (-4, -3, 1)$, $\mathbf{y} = (1, -1, 6)$ (d) $\mathbf{x} = (1, 2, 4, -1)$, $\mathbf{y} = (2, -1, 2, 3)$

9. Verify properties (1.2.5) through (1.2.11) of the dot product.

10. If $\mathbf{w}$ is the projection of $\mathbf{x}$ onto $\mathbf{y}$, verify that $\mathbf{y}$ is orthogonal to $\mathbf{x} - \mathbf{w}$.

11. Write $\mathbf{x} = (1, 2, -3)$ as the sum of two vectors, one parallel to $\mathbf{y} = (2, 3, 1)$ and the
other orthogonal to $\mathbf{y}$.

12. Suppose $\mathbf{x}$ is a vector with the property that $\mathbf{x} \cdot \mathbf{y} = 0$ for all vectors $\mathbf{y}$ in $\mathbb{R}^n$, $\mathbf{y} \neq \mathbf{x}$.
Show that it follows that $\mathbf{x} = \mathbf{0}$.

The Cross Product

As we noted in Section 1.1, there is no general way to define multiplication for vectors in
$\mathbb{R}^n$, with the product also being a vector of the same dimension, which is useful for our
purposes in this book. However, in the special case of $\mathbb{R}^3$ there is a product which we will
find useful. One motivation for this product is to consider the following problem: Given
two vectors $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$, not parallel to one another, find a
third vector $\mathbf{w} = (w_1, w_2, w_3)$ which is orthogonal to both $\mathbf{x}$ and $\mathbf{y}$. Thus we want $\mathbf{w} \cdot \mathbf{x} = 0$
and $\mathbf{w} \cdot \mathbf{y} = 0$, which means we need to solve the equations

$$\begin{aligned} x_1 w_1 + x_2 w_2 + x_3 w_3 &= 0 \\ y_1 w_1 + y_2 w_2 + y_3 w_3 &= 0 \end{aligned} \tag{1.3.1}$$

for $w_1$, $w_2$, and $w_3$. Multiplying the first equation by $y_3$ and the second by $x_3$ gives us

$$\begin{aligned} x_1 y_3 w_1 + x_2 y_3 w_2 + x_3 y_3 w_3 &= 0 \\ x_3 y_1 w_1 + x_3 y_2 w_2 + x_3 y_3 w_3 &= 0. \end{aligned} \tag{1.3.2}$$

Subtracting the second equation from the first, we have

$$(x_1 y_3 - x_3 y_1) w_1 + (x_2 y_3 - x_3 y_2) w_2 = 0. \tag{1.3.3}$$

One solution of (1.3.3) is given by setting

$$\begin{aligned} w_1 &= x_2 y_3 - x_3 y_2, \\ w_2 &= -(x_1 y_3 - x_3 y_1) = x_3 y_1 - x_1 y_3. \end{aligned} \tag{1.3.4}$$

Finally, from the first equation in (1.3.1), we now have

$$x_3 w_3 = -x_1 (x_2 y_3 - x_3 y_2) - x_2 (x_3 y_1 - x_1 y_3) = x_1 x_3 y_2 - x_2 x_3 y_1, \tag{1.3.5}$$

from which we obtain the solution

$$w_3 = x_1 y_2 - x_2 y_1. \tag{1.3.6}$$

The choices made in arriving at (1.3.4) and (1.3.6) are not unique, but they are the standard
choices which define $\mathbf{w}$ as the cross or vector product of $\mathbf{x}$ and $\mathbf{y}$.

Definition Given vectors $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$ in $\mathbb{R}^3$, the vector

$$\mathbf{x} \times \mathbf{y} = (x_2 y_3 - x_3 y_2,\; x_3 y_1 - x_1 y_3,\; x_1 y_2 - x_2 y_1) \tag{1.3.7}$$

is called the cross product, or vector product, of $\mathbf{x}$ and $\mathbf{y}$.
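
Example If $\mathbf{x} = (1, 2, 3)$ and $\mathbf{y} = (1, -1, 1)$, then

$$\mathbf{x} \times \mathbf{y} = (2 + 3,\; 3 - 1,\; -1 - 2) = (5, 2, -3).$$

Note that

$$\mathbf{x} \cdot (\mathbf{x} \times \mathbf{y}) = 5 + 4 - 9 = 0$$

and

$$\mathbf{y} \cdot (\mathbf{x} \times \mathbf{y}) = 5 - 2 - 3 = 0,$$

showing that $\mathbf{x} \perp (\mathbf{x} \times \mathbf{y})$ and $\mathbf{y} \perp (\mathbf{x} \times \mathbf{y})$ as claimed. It is also interesting to note that

$$\mathbf{y} \times \mathbf{x} = (-3 - 2,\; 1 - 3,\; 2 + 1) = (-5, -2, 3) = -(\mathbf{x} \times \mathbf{y}).$$

This last calculation holds in general for all vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^3$.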
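
As a quick check of definition (1.3.7) and of the example just worked, here is a small Python sketch. The function names `dot` and `cross` are illustrative and not part of the text.

```python
def dot(x, y):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def cross(x, y):
    """Cross product of two vectors in R^3, term by term as in (1.3.7)."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)

x, y = (1, 2, 3), (1, -1, 1)
w = cross(x, y)

print(w)                      # (5, 2, -3)
print(dot(x, w), dot(y, w))   # 0 0, so w is orthogonal to both x and y
print(cross(y, x))            # (-5, -2, 3), the negative of w
```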

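Proposition Suppose $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ are vectors in $\mathbb{R}^3$ and $a$ is any real number. Then

$$\mathbf{x} \times \mathbf{y} = -(\mathbf{y} \times \mathbf{x}), \tag{1.3.8}$$

$$\mathbf{x} \times (\mathbf{y} + \mathbf{z}) = (\mathbf{x} \times \mathbf{y}) + (\mathbf{x} \times \mathbf{z}), \tag{1.3.9}$$

$$(\mathbf{x} + \mathbf{y}) \times \mathbf{z} = (\mathbf{x} \times \mathbf{z}) + (\mathbf{y} \times \mathbf{z}), \tag{1.3.10}$$

$$a(\mathbf{x} \times \mathbf{y}) = (a\mathbf{x}) \times \mathbf{y} = \mathbf{x} \times (a\mathbf{y}), \tag{1.3.11}$$

and

$$\mathbf{x} \times \mathbf{x} = \mathbf{0}. \tag{1.3.12}$$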

Verification of these properties is straightforward and will be left to Problem 10. Also, 
notice that 

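$$\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3, \tag{1.3.13}$$

$$\mathbf{e}_2 \times \mathbf{e}_3 = \mathbf{e}_1, \tag{1.3.14}$$

and

$$\mathbf{e}_3 \times \mathbf{e}_1 = \mathbf{e}_2; \tag{1.3.15}$$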

that is, the cross product of two standard basis vectors is either the other standard basis 
vector or its negative. Moreover, note that in these cases the cross product points in the 
direction your thumb would point if you were to wrap the fingers of your right hand from 
the first vector to the second. This is in fact always true and results in what is known 
as the right-hand rule for the orientation of the cross product, as shown in Figure 1.3.1.

Figure 1.3.1 The right-hand rule

Hence given two vectors $\mathbf{x}$ and $\mathbf{y}$, we can always determine the direction of $\mathbf{x} \times \mathbf{y}$; to
completely identify $\mathbf{x} \times \mathbf{y}$ geometrically, we need only to know its length. Now if $\theta$ is the
angle between $\mathbf{x}$ and $\mathbf{y}$, then

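$$\begin{aligned}
\|\mathbf{x} \times \mathbf{y}\|^2 &= (x_2 y_3 - x_3 y_2)^2 + (x_3 y_1 - x_1 y_3)^2 + (x_1 y_2 - x_2 y_1)^2 \\
&= x_2^2 y_3^2 - 2 x_2 x_3 y_2 y_3 + x_3^2 y_2^2 + x_3^2 y_1^2 - 2 x_1 x_3 y_1 y_3 + x_1^2 y_3^2 + x_1^2 y_2^2 \\
&\qquad - 2 x_1 x_2 y_1 y_2 + x_2^2 y_1^2 \\
&= (x_1^2 + x_2^2 + x_3^2)(y_1^2 + y_2^2 + y_3^2) - (x_1^2 y_1^2 + x_2^2 y_2^2 + x_3^2 y_3^2) \\
&\qquad - (2 x_2 x_3 y_2 y_3 + 2 x_1 x_3 y_1 y_3 + 2 x_1 x_2 y_1 y_2) \\
&= (x_1^2 + x_2^2 + x_3^2)(y_1^2 + y_2^2 + y_3^2) - (x_1 y_1 + x_2 y_2 + x_3 y_3)^2 \\
&= \|\mathbf{x}\|^2 \|\mathbf{y}\|^2 - (\mathbf{x} \cdot \mathbf{y})^2 \\
&= \|\mathbf{x}\|^2 \|\mathbf{y}\|^2 - (\|\mathbf{x}\| \|\mathbf{y}\| \cos(\theta))^2 \\
&= \|\mathbf{x}\|^2 \|\mathbf{y}\|^2 (1 - \cos^2(\theta)) \\
&= \|\mathbf{x}\|^2 \|\mathbf{y}\|^2 \sin^2(\theta).
\end{aligned} \tag{1.3.16}$$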

Taking square roots, and noting that $\sin(\theta) \ge 0$ since, by the definition of the angle between
two vectors, $0 \le \theta \le \pi$, we have the following result.

Proposition If $\theta$ is the angle between two vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^3$, then

$$\|\mathbf{x} \times \mathbf{y}\| = \|\mathbf{x}\| \|\mathbf{y}\| \sin(\theta). \tag{1.3.17}$$

Figure 1.3.2 Height of the parallelogram is $h = \|\mathbf{y}\| \sin(\theta)$

Figure 1.3.3 Parallelogram with vertices at $(0, 0, 0)$, $(6, 1, 1)$, $(8, 5, 2)$, and $(2, 4, 1)$

The last theorem has several interesting consequences. One of these comes from recognizing
that if we draw a parallelogram with $\mathbf{x}$ and $\mathbf{y}$ as adjacent sides, as in Figure 1.3.2,
then the height of the parallelogram is $\|\mathbf{y}\| \sin(\theta)$, where $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$.
Hence the area of the parallelogram is $\|\mathbf{x}\| \|\mathbf{y}\| \sin(\theta)$, which by (1.3.17) is $\|\mathbf{x} \times \mathbf{y}\|$.

Proposition Suppose $\mathbf{x}$ and $\mathbf{y}$ are two vectors in $\mathbb{R}^3$. Then the area of the parallelogram
which has $\mathbf{x}$ and $\mathbf{y}$ for adjacent sides is $\|\mathbf{x} \times \mathbf{y}\|$.

Example Consider the parallelogram $P$ with vertices at $(0, 0, 0)$, $(6, 1, 1)$, $(8, 5, 2)$, and
$(2, 4, 1)$. Two adjacent sides are specified by the vectors $\mathbf{x} = (6, 1, 1)$ and $\mathbf{y} = (2, 4, 1)$ (see
Figure 1.3.3), so the area of $P$ is

$$\|\mathbf{x} \times \mathbf{y}\| = \|(1 - 4,\; 2 - 6,\; 24 - 2)\| = \|(-3, -4, 22)\| = \sqrt{509}.$$

See Figure 1.3.4 to see the relationship between $\mathbf{x} \times \mathbf{y}$ and $P$.

Example Consider the parallelogram $P$ in the plane with vertices at $(1, 1)$, $(3, 2)$, $(4, 4)$,
and $(2, 3)$. Two adjacent sides are given by the vectors from $(1, 1)$ to $(3, 2)$, that is,

$$\mathbf{x} = (3, 2) - (1, 1) = (2, 1),$$

and from $(1, 1)$ to $(2, 3)$, that is,

$$\mathbf{y} = (2, 3) - (1, 1) = (1, 2).$$

See Figure 1.3.5. However, since these vectors are in $\mathbb{R}^2$, not in $\mathbb{R}^3$, we cannot compute their
cross product. To get around this, we consider the vectors $\mathbf{w} = (2, 1, 0)$ and $\mathbf{v} = (1, 2, 0)$
which are adjacent sides of the same parallelogram viewed as lying in $\mathbb{R}^3$. Then the area
of $P$ is given by

$$\|\mathbf{w} \times \mathbf{v}\| = \|(0, 0, 4 - 1)\| = \|(0, 0, 3)\| = 3.$$
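
Both area computations above follow the same recipe, which can be sketched in Python as follows. The function `parallelogram_area` is a hypothetical helper written for this illustration; it embeds vectors from the plane into space by appending a zero third coordinate, as in the second example.

```python
import math

def cross(x, y):
    """Cross product of two vectors in R^3, as in (1.3.7)."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)

def norm(x):
    """Length of a vector."""
    return math.sqrt(sum(c * c for c in x))

def parallelogram_area(x, y):
    """Area of the parallelogram with adjacent sides x and y.

    Vectors with two coordinates are viewed as lying in R^3 by
    appending a zero third coordinate.
    """
    if len(x) == 2:
        x, y = (*x, 0), (*y, 0)
    return norm(cross(x, y))

area_3d = parallelogram_area((6, 1, 1), (2, 4, 1))
area_2d = parallelogram_area((2, 1), (1, 2))
print(area_3d)   # equals sqrt(509)
print(area_2d)   # 3.0
```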

Figure 1.3.4 Parallelogram with adjacent sides $\mathbf{x} = (6, 1, 1)$ and $\mathbf{y} = (2, 4, 1)$

Figure 1.3.5 Parallelogram with vertices at $(1, 1)$, $(3, 2)$, $(4, 4)$, and $(2, 3)$

It is easy to extend the result of the previous theorem to computing the volume of
a parallelepiped in $\mathbb{R}^3$. Suppose $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ are adjacent edges of parallelepiped $P$, as
shown in Figure 1.3.6. Then the volume $V$ of $P$ is $\|\mathbf{x} \times \mathbf{y}\|$, which is the area of the base,
multiplied by the height of $P$, which is the length of the projection of $\mathbf{z}$ onto $\mathbf{x} \times \mathbf{y}$. Since
the latter is equal to

$$\left| \mathbf{z} \cdot \frac{\mathbf{x} \times \mathbf{y}}{\|\mathbf{x} \times \mathbf{y}\|} \right|,$$

we have

$$V = \|\mathbf{x} \times \mathbf{y}\| \left| \mathbf{z} \cdot \frac{\mathbf{x} \times \mathbf{y}}{\|\mathbf{x} \times \mathbf{y}\|} \right| = |\mathbf{z} \cdot (\mathbf{x} \times \mathbf{y})|. \tag{1.3.18}$$

Figure 1.3.6 Parallelepiped with adjacent edges $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$

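Proposition The volume of a parallelepiped with adjacent edges $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ is $|\mathbf{z} \cdot (\mathbf{x} \times \mathbf{y})|$.

Definition Given three vectors $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ in $\mathbb{R}^3$, the quantity $\mathbf{z} \cdot (\mathbf{x} \times \mathbf{y})$ is called the
scalar triple product of $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$.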

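Example Let $\mathbf{x} = (1, 4, 1)$, $\mathbf{y} = (-3, 1, 1)$, and $\mathbf{z} = (0, 1, 5)$ be adjacent edges of
parallelepiped $P$ (see Figure 1.3.7). Then

$$\mathbf{x} \times \mathbf{y} = (4 - 1,\; -3 - 1,\; 1 + 12) = (3, -4, 13),$$

so

$$\mathbf{z} \cdot (\mathbf{x} \times \mathbf{y}) = 0 - 4 + 65 = 61.$$

Hence the volume of $P$ is 61.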
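
The volume in this example is the absolute value of a scalar triple product, which takes only a few lines to compute. The sketch below uses illustrative helper names (`dot`, `cross`, `scalar_triple`, `volume`) that do not appear in the text.

```python
def dot(x, y):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def cross(x, y):
    """Cross product of two vectors in R^3, as in (1.3.7)."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)

def scalar_triple(x, y, z):
    """Scalar triple product z . (x x y)."""
    return dot(z, cross(x, y))

def volume(x, y, z):
    """Volume of the parallelepiped with adjacent edges x, y, z, as in (1.3.18)."""
    return abs(scalar_triple(x, y, z))

x, y, z = (1, 4, 1), (-3, 1, 1), (0, 1, 5)
print(cross(x, y))      # (3, -4, 13)
print(volume(x, y, z))  # 61
```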

The final result of this section follows from (1.3.17) and the fact that the angle between
parallel vectors is either $0$ or $\pi$.

Proposition Vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^3$ are parallel if and only if $\mathbf{x} \times \mathbf{y} = \mathbf{0}$.

Note that, in particular, for any vector $\mathbf{x}$ in $\mathbb{R}^3$, $\mathbf{x} \times \mathbf{x} = \mathbf{0}$.

Figure 1.3.7 Parallelepiped with adjacent edges $\mathbf{x} = (1, 4, 1)$, $\mathbf{y} = (-3, 1, 1)$, $\mathbf{z} = (0, 1, 5)$

Problems 

1. For each of the following pairs of vectors $\mathbf{x}$ and $\mathbf{y}$, find $\mathbf{x} \times \mathbf{y}$ and verify that $\mathbf{x} \perp (\mathbf{x} \times \mathbf{y})$
and $\mathbf{y} \perp (\mathbf{x} \times \mathbf{y})$.

(a) $\mathbf{x} = (1, 2, -1)$, $\mathbf{y} = (-2, 3, -1)$ (b) $\mathbf{x} = (-2, 1, 4)$, $\mathbf{y} = (3, 1, 2)$

(c) $\mathbf{x} = (1, 3, -2)$, $\mathbf{y} = (3, 9, 6)$ (d) $\mathbf{x} = (-1, 4, 1)$, $\mathbf{y} = (3, 2, -1)$

2. Find the area of the parallelogram in $\mathbb{R}^3$ that has the vectors $\mathbf{x} = (2, 3, 1)$ and
$\mathbf{y} = (-3, 3, 1)$ for adjacent sides.

3. Find the area of the parallelogram in $\mathbb{R}^2$ that has the vectors $\mathbf{x} = (3, 1)$ and $\mathbf{y} = (1, 4)$
for adjacent sides.

4. Find the area of the parallelogram in $\mathbb{R}^3$ that has vertices at $(1, 1, 1)$, $(2, 3, 2)$, $(-2, 4, 4)$,
and $(-3, 2, 3)$.

5. Find the area of the parallelogram in $\mathbb{R}^2$ that has vertices at $(2, -1)$, $(4, -2)$, $(3, 0)$,
and $(1, 1)$.

6. Find the area of the triangle in $\mathbb{R}^3$ that has vertices at $(1, 1, 0)$, $(2, 3, 1)$, and $(-1, 3, 2)$.

7. Find the area of the triangle in $\mathbb{R}^2$ that has vertices at $(-1, 2)$, $(2, -1)$, and $(1, 3)$.

8. Find the volume of the parallelepiped that has the vectors $\mathbf{x} = (1, 2, 1)$, $\mathbf{y} = (-1, 1, 1)$,
and $\mathbf{z} = (-1, -1, 6)$ for adjacent sides.

9. A parallelepiped has base vertices at $(1, 1, 1)$, $(2, 3, 2)$, $(-2, 4, 4)$, and $(-3, 2, 3)$ and top
vertices at $(2, 2, 6)$, $(3, 4, 7)$, $(-1, 5, 9)$, and $(-2, 3, 8)$. Find its volume.
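
10. Verify the properties of the cross product stated in (1.3.8) through (1.3.12).

11. Since $|\mathbf{z} \cdot (\mathbf{x} \times \mathbf{y})|$, $|\mathbf{y} \cdot (\mathbf{z} \times \mathbf{x})|$, and $|\mathbf{x} \cdot (\mathbf{y} \times \mathbf{z})|$ are all equal to the volume of a
parallelepiped with adjacent edges $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$, they should all have the same value.
Show that in fact

$$\mathbf{z} \cdot (\mathbf{x} \times \mathbf{y}) = \mathbf{y} \cdot (\mathbf{z} \times \mathbf{x}) = \mathbf{x} \cdot (\mathbf{y} \times \mathbf{z}).$$

How do these compare with $\mathbf{z} \cdot (\mathbf{y} \times \mathbf{x})$, $\mathbf{y} \cdot (\mathbf{x} \times \mathbf{z})$, and $\mathbf{x} \cdot (\mathbf{z} \times \mathbf{y})$?

12. Suppose $\mathbf{x}$ and $\mathbf{y}$ are parallel vectors in $\mathbb{R}^3$. Show directly from the definition of the
cross product that $\mathbf{x} \times \mathbf{y} = \mathbf{0}$.

13. Show by example that the cross product is not associative. That is, find vectors $\mathbf{x}$, $\mathbf{y}$,
and $\mathbf{z}$ such that

$$\mathbf{x} \times (\mathbf{y} \times \mathbf{z}) \ne (\mathbf{x} \times \mathbf{y}) \times \mathbf{z}.$$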



The Calculus of Functions of Several Variables

Section 1.4

Lines, Planes, and Hyperplanes

In this section we will add to our basic geometric understanding of $\mathbb{R}^n$ by studying lines
and planes. If we do this carefully, we shall see that working with lines and planes in $\mathbb{R}^n$
is no more difficult than working with them in $\mathbb{R}^2$ or $\mathbb{R}^3$.

Lines in $\mathbb{R}^n$

We will start with lines. Recall from Section 1.1 that if $\mathbf{v}$ is a nonzero vector in $\mathbb{R}^n$, then,
for any scalar $t$, $t\mathbf{v}$ has the same direction as $\mathbf{v}$ when $t > 0$ and the opposite direction
when $t < 0$. Hence the set of points

$$\{t\mathbf{v} : -\infty < t < \infty\}$$

forms a line through the origin. If we now add a vector $\mathbf{p}$ to each of these points, we obtain
the set of points

$$\{t\mathbf{v} + \mathbf{p} : -\infty < t < \infty\},$$

which is a line through $\mathbf{p}$ in the direction of $\mathbf{v}$, as illustrated in Figure 1.4.1 for $\mathbb{R}^2$.

Figure 1.4.1 A line in $\mathbb{R}^2$ through $\mathbf{p}$ in the direction of $\mathbf{v}$

Definition Given a vector $\mathbf{p}$ and a nonzero vector $\mathbf{v}$ in $\mathbb{R}^n$, the set of all points $\mathbf{y}$ in $\mathbb{R}^n$
such that

$$\mathbf{y} = t\mathbf{v} + \mathbf{p}, \tag{1.4.1}$$

where $-\infty < t < \infty$, is called the line through $\mathbf{p}$ in the direction of $\mathbf{v}$.

Figure 1.4.2 The line through $\mathbf{p} = (1, 2)$ in the direction of $\mathbf{v} = (1, -3)$



Equation (1.4.1) is called a vector equation for the line. If we write $\mathbf{y} = (y_1, y_2, \ldots, y_n)$,
$\mathbf{v} = (v_1, v_2, \ldots, v_n)$, and $\mathbf{p} = (p_1, p_2, \ldots, p_n)$, then (1.4.1) may be written as

$$(y_1, y_2, \ldots, y_n) = t(v_1, v_2, \ldots, v_n) + (p_1, p_2, \ldots, p_n), \tag{1.4.2}$$

which holds if and only if

$$\begin{aligned} y_1 &= t v_1 + p_1, \\ y_2 &= t v_2 + p_2, \\ &\;\;\vdots \\ y_n &= t v_n + p_n. \end{aligned} \tag{1.4.3}$$

The equations in (1.4.3) are called parametric equations for the line.

Example Suppose $L$ is the line in $\mathbb{R}^2$ through $\mathbf{p} = (1, 2)$ in the direction of $\mathbf{v} = (1, -3)$
(see Figure 1.4.2). Then

$$\mathbf{y} = t(1, -3) + (1, 2) = (t + 1, -3t + 2)$$

is a vector equation for $L$ and, if we let $\mathbf{y} = (x, y)$,

$$\begin{aligned} x &= t + 1, \\ y &= -3t + 2 \end{aligned}$$

are parametric equations for $L$. Note that if we solve for $t$ in both of these equations, we
have

$$t = x - 1, \qquad t = \frac{2 - y}{3}.$$

Thus

$$x - 1 = \frac{2 - y}{3},$$

and so

$$y = -3x + 5.$$

Of course, the latter is just the standard slope-intercept form for the equation of a line in
$\mathbb{R}^2$.

Figure 1.4.3 The line through $\mathbf{p} = (1, 3, 1)$ and $\mathbf{q} = (-1, 1, 4)$

Example Now suppose we wish to find an equation for the line $L$ in $\mathbb{R}^3$ which passes
through the points $\mathbf{p} = (1, 3, 1)$ and $\mathbf{q} = (-1, 1, 4)$ (see Figure 1.4.3). We first note that
the vector

$$\mathbf{p} - \mathbf{q} = (2, 2, -3)$$

gives the direction of the line, so

$$\mathbf{y} = t(2, 2, -3) + (1, 3, 1)$$

is a vector equation for $L$; if we let $\mathbf{y} = (x, y, z)$, then

$$\begin{aligned} x &= 2t + 1, \\ y &= 2t + 3, \\ z &= -3t + 1 \end{aligned}$$

are parametric equations for $L$.

As an application of these ideas, consider the problem of finding the shortest distance
from a point $\mathbf{q}$ in $\mathbb{R}^n$ to a line $L$ with equation $\mathbf{y} = t\mathbf{v} + \mathbf{p}$. If we let $\mathbf{w}$ be the projection
of $\mathbf{q} - \mathbf{p}$ onto $\mathbf{v}$, then, as we saw in Section 1.2, the vector $(\mathbf{q} - \mathbf{p}) - \mathbf{w}$ is orthogonal to $\mathbf{v}$
and may be pictured with its tail on $L$ and its tip at $\mathbf{q}$. Hence the shortest distance from
$\mathbf{q}$ to $L$ is $\|(\mathbf{q} - \mathbf{p}) - \mathbf{w}\|$. See Figure 1.4.4.

Figure 1.4.4 Distance from a point $\mathbf{q}$ to a line

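Example To find the distance from the point $\mathbf{q} = (2, 2, 4)$ to the line $L$ through the
points $\mathbf{p} = (1, 0, 0)$ and $\mathbf{r} = (0, 1, 0)$, we must first find an equation for $L$. Since the
direction of $L$ is given by $\mathbf{v} = \mathbf{r} - \mathbf{p} = (-1, 1, 0)$, a vector equation for $L$ is

$$\mathbf{y} = t(-1, 1, 0) + (1, 0, 0).$$

If we let

$$\mathbf{u} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{1}{\sqrt{2}}(-1, 1, 0),$$

then the projection of $\mathbf{q} - \mathbf{p}$ onto $\mathbf{v}$ is

$$\mathbf{w} = ((\mathbf{q} - \mathbf{p}) \cdot \mathbf{u})\mathbf{u} = \left((1, 2, 4) \cdot \frac{1}{\sqrt{2}}(-1, 1, 0)\right) \frac{1}{\sqrt{2}}(-1, 1, 0) = \frac{1}{2}(-1, 1, 0).$$

Thus the distance from $\mathbf{q}$ to $L$ is

$$\|(\mathbf{q} - \mathbf{p}) - \mathbf{w}\| = \left\|\left(\frac{3}{2}, \frac{3}{2}, 4\right)\right\| = \sqrt{\frac{41}{2}}.$$

Figure 1.4.5 Parallel ($L$ and $M$) and perpendicular ($L$ and $N$) lines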
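
The point-to-line distance in the example above can be reproduced with a short Python sketch. The function `distance_to_line` and the helpers around it are illustrative names; the sketch uses the equivalent form $\mathbf{w} = \frac{(\mathbf{q} - \mathbf{p}) \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}}\mathbf{v}$ for the projection, which avoids normalizing $\mathbf{v}$ first.

```python
import math

def dot(x, y):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def sub(x, y):
    """Componentwise difference x - y."""
    return tuple(a - b for a, b in zip(x, y))

def scale(c, x):
    """Scalar multiple c x."""
    return tuple(c * a for a in x)

def norm(x):
    """Length of a vector."""
    return math.sqrt(dot(x, x))

def distance_to_line(q, p, v):
    """Distance from the point q to the line y = t v + p (v nonzero)."""
    d = sub(q, p)
    w = scale(dot(d, v) / dot(v, v), v)   # projection of q - p onto v
    return norm(sub(d, w))

p, r, q = (1, 0, 0), (0, 1, 0), (2, 2, 4)
v = sub(r, p)                             # direction (-1, 1, 0)
dist = distance_to_line(q, p, v)
print(dist)                               # equals sqrt(41/2)
```

Because nothing in the function depends on the number of coordinates, the same sketch applies to lines in $\mathbb{R}^n$ for any $n$.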




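Definition Suppose $L$ and $M$ are lines in $\mathbb{R}^n$ with equations $\mathbf{y} = t\mathbf{v} + \mathbf{p}$ and $\mathbf{y} = t\mathbf{w} + \mathbf{q}$,
respectively. We say $L$ and $M$ are parallel if $\mathbf{v}$ and $\mathbf{w}$ are parallel. We say $L$ and $M$ are
perpendicular, or orthogonal, if they intersect and $\mathbf{v}$ and $\mathbf{w}$ are orthogonal.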

Note that, by definition, a line is parallel to itself. 

Example The lines $L$ and $M$ in $\mathbb{R}^3$ with equations

$$\mathbf{y} = t(1, 2, -1) + (4, 1, 2)$$

and

$$\mathbf{y} = t(-2, -4, 2) + (5, 6, 1),$$

respectively, are parallel since $(-2, -4, 2) = -2(1, 2, -1)$, that is, the vectors $(1, 2, -1)$ and
$(-2, -4, 2)$ are parallel. See Figure 1.4.5.

Example The lines $L$ and $N$ in $\mathbb{R}^3$ with equations

$$\mathbf{y} = t(1, 2, -1) + (4, 1, 2)$$

and

$$\mathbf{y} = t(3, -1, 1) + (-1, 5, -1),$$

respectively, are perpendicular since they intersect at $(5, 3, 1)$ (when $t = 1$ for the first line
and $t = 2$ for the second line) and $(1, 2, -1)$ and $(3, -1, 1)$ are orthogonal since

$$(1, 2, -1) \cdot (3, -1, 1) = 3 - 2 - 1 = 0.$$

See Figure 1.4.5.

Planes in $\mathbb{R}^n$

The following definition is the first step in defining a plane.

Definition Two vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$ are said to be linearly independent if neither one
is a scalar multiple of the other.

Geometrically, $\mathbf{x}$ and $\mathbf{y}$ are linearly independent if they do not lie on the same line
through the origin. Notice that for any vector $\mathbf{x}$, $\mathbf{0}$ and $\mathbf{x}$ are not linearly independent,
that is, they are linearly dependent, since $\mathbf{0} = 0\mathbf{x}$.

Definition Given a vector $\mathbf{p}$ along with linearly independent vectors $\mathbf{v}$ and $\mathbf{w}$, all in $\mathbb{R}^n$,
the set of all points $\mathbf{y}$ such that

$$\mathbf{y} = t\mathbf{v} + s\mathbf{w} + \mathbf{p}, \tag{1.4.4}$$

where $-\infty < t < \infty$ and $-\infty < s < \infty$, is called a plane.

The intuition here is that a plane should be a two dimensional object, which is
guaranteed because of the requirement that $\mathbf{v}$ and $\mathbf{w}$ are linearly independent. Also
note that if we let $\mathbf{y} = (y_1, y_2, \ldots, y_n)$, $\mathbf{v} = (v_1, v_2, \ldots, v_n)$, $\mathbf{w} = (w_1, w_2, \ldots, w_n)$, and
$\mathbf{p} = (p_1, p_2, \ldots, p_n)$, then (1.4.4) implies that

$$\begin{aligned} y_1 &= t v_1 + s w_1 + p_1, \\ y_2 &= t v_2 + s w_2 + p_2, \\ &\;\;\vdots \\ y_n &= t v_n + s w_n + p_n. \end{aligned} \tag{1.4.5}$$

As with lines, (1.4.4) is a vector equation for the plane and the equations in (1.4.5) are
parametric equations for the plane.

Example Suppose we wish to find an equation for the plane $P$ in $\mathbb{R}^3$ which contains the
three points $\mathbf{p} = (1, 2, 1)$, $\mathbf{q} = (-1, 3, 2)$, and $\mathbf{r} = (2, 3, -1)$. The first step is to find two
linearly independent vectors $\mathbf{v}$ and $\mathbf{w}$ which lie in the plane. Since $P$ must contain the
line segments from $\mathbf{p}$ to $\mathbf{q}$ and from $\mathbf{p}$ to $\mathbf{r}$, we can take

$$\mathbf{v} = \mathbf{q} - \mathbf{p} = (-2, 1, 1)$$

and

$$\mathbf{w} = \mathbf{r} - \mathbf{p} = (1, 1, -2).$$

Note that $\mathbf{v}$ and $\mathbf{w}$ are linearly independent, a consequence of $\mathbf{p}$, $\mathbf{q}$, and $\mathbf{r}$ not all lying on
the same line. See Figure 1.4.6. We may now write a vector equation for $P$ as

$$\mathbf{y} = t(-2, 1, 1) + s(1, 1, -2) + (1, 2, 1).$$

Note that $\mathbf{y} = \mathbf{p}$ when $t = 0$ and $s = 0$, $\mathbf{y} = \mathbf{q}$ when $t = 1$ and $s = 0$, and $\mathbf{y} = \mathbf{r}$ when
$t = 0$ and $s = 1$. If we write $\mathbf{y} = (x, y, z)$, then, expanding the vector equation,

$$(x, y, z) = t(-2, 1, 1) + s(1, 1, -2) + (1, 2, 1) = (-2t + s + 1,\; t + s + 2,\; t - 2s + 1),$$

giving us

$$\begin{aligned} x &= -2t + s + 1, \\ y &= t + s + 2, \\ z &= t - 2s + 1 \end{aligned}$$

for parametric equations for $P$.



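To find the shortest distance from a point $\mathbf{q}$ to a plane $P$, we first need to consider the
problem of finding the projection of a vector onto a plane. To begin, consider the plane $P$
through the origin with equation $\mathbf{y} = t\mathbf{a} + s\mathbf{b}$ where $\|\mathbf{a}\| = 1$, $\|\mathbf{b}\| = 1$, and $\mathbf{a} \perp \mathbf{b}$. Given
a vector $\mathbf{q}$ not in $P$, let

$$\mathbf{r} = (\mathbf{q} \cdot \mathbf{a})\mathbf{a} + (\mathbf{q} \cdot \mathbf{b})\mathbf{b},$$

the sum of the projections of $\mathbf{q}$ onto $\mathbf{a}$ and onto $\mathbf{b}$. Then

$$\begin{aligned} (\mathbf{q} - \mathbf{r}) \cdot \mathbf{a} &= \mathbf{q} \cdot \mathbf{a} - \mathbf{r} \cdot \mathbf{a} \\ &= \mathbf{q} \cdot \mathbf{a} - (\mathbf{q} \cdot \mathbf{a})(\mathbf{a} \cdot \mathbf{a}) - (\mathbf{q} \cdot \mathbf{b})(\mathbf{b} \cdot \mathbf{a}) \\ &= \mathbf{q} \cdot \mathbf{a} - \mathbf{q} \cdot \mathbf{a} = 0, \end{aligned}$$

since $\mathbf{a} \cdot \mathbf{a} = \|\mathbf{a}\|^2 = 1$ and $\mathbf{b} \cdot \mathbf{a} = 0$, and, similarly,

$$\begin{aligned} (\mathbf{q} - \mathbf{r}) \cdot \mathbf{b} &= \mathbf{q} \cdot \mathbf{b} - \mathbf{r} \cdot \mathbf{b} \\ &= \mathbf{q} \cdot \mathbf{b} - (\mathbf{q} \cdot \mathbf{a})(\mathbf{a} \cdot \mathbf{b}) - (\mathbf{q} \cdot \mathbf{b})(\mathbf{b} \cdot \mathbf{b}) \\ &= \mathbf{q} \cdot \mathbf{b} - \mathbf{q} \cdot \mathbf{b} = 0. \end{aligned}$$

It follows that for any $\mathbf{y} = t\mathbf{a} + s\mathbf{b}$ in the plane $P$,

$$(\mathbf{q} - \mathbf{r}) \cdot \mathbf{y} = (\mathbf{q} - \mathbf{r}) \cdot (t\mathbf{a} + s\mathbf{b}) = t(\mathbf{q} - \mathbf{r}) \cdot \mathbf{a} + s(\mathbf{q} - \mathbf{r}) \cdot \mathbf{b} = 0.$$

That is, $\mathbf{q} - \mathbf{r}$ is orthogonal to every vector in the plane $P$. For this reason, we call $\mathbf{r}$ the
projection of $\mathbf{q}$ onto the plane $P$, and we note that the shortest distance from $\mathbf{q}$ to $P$ is
$\|\mathbf{q} - \mathbf{r}\|$.

Figure 1.4.7 Distance from a point $\mathbf{q}$ to a plane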

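In the general case, given a point $\mathbf{q}$ and a plane $P$ with equation $\mathbf{y} = t\mathbf{v} + s\mathbf{w} + \mathbf{p}$,
we need only find vectors $\mathbf{a}$ and $\mathbf{b}$ such that $\mathbf{a} \perp \mathbf{b}$, $\|\mathbf{a}\| = 1$, $\|\mathbf{b}\| = 1$, and the equation
$\mathbf{y} = t\mathbf{a} + s\mathbf{b} + \mathbf{p}$ describes the same plane $P$. You are asked in Problem 29 to verify that
if we let $\mathbf{c}$ be the projection of $\mathbf{w}$ onto $\mathbf{v}$, then we may take

$$\mathbf{a} = \frac{1}{\|\mathbf{v}\|}\mathbf{v}$$

and

$$\mathbf{b} = \frac{1}{\|\mathbf{w} - \mathbf{c}\|}(\mathbf{w} - \mathbf{c}).$$

If $\mathbf{r}$ is the sum of the projections of $\mathbf{q} - \mathbf{p}$ onto $\mathbf{a}$ and $\mathbf{b}$, then $\mathbf{r}$ is the projection of $\mathbf{q} - \mathbf{p}$
onto $P$ and $\|(\mathbf{q} - \mathbf{p}) - \mathbf{r}\|$ is the shortest distance from $\mathbf{q}$ to $P$. See Figure 1.4.7.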

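Example To compute the distance from the point $\mathbf{q} = (2, 3, 3)$ to the plane $P$ with
equation

$$\mathbf{y} = t(-2, 1, 0) + s(1, -1, 1) + (-1, 2, 1),$$

let $\mathbf{v} = (-2, 1, 0)$, $\mathbf{w} = (1, -1, 1)$, and $\mathbf{p} = (-1, 2, 1)$. Then, using the above notation, we
have

$$\mathbf{a} = \frac{1}{\sqrt{5}}(-2, 1, 0),$$

$$\mathbf{c} = (\mathbf{w} \cdot \mathbf{a})\mathbf{a} = -\frac{3}{5}(-2, 1, 0),$$

$$\mathbf{w} - \mathbf{c} = \frac{1}{5}(-1, -2, 5),$$

and

$$\mathbf{b} = \frac{1}{\sqrt{30}}(-1, -2, 5).$$

Since $\mathbf{q} - \mathbf{p} = (3, 1, 2)$, the projection of $\mathbf{q} - \mathbf{p}$ onto $P$ is

$$\mathbf{r} = ((3, 1, 2) \cdot \mathbf{a})\mathbf{a} + ((3, 1, 2) \cdot \mathbf{b})\mathbf{b} = -(-2, 1, 0) + \frac{1}{6}(-1, -2, 5) = \frac{1}{6}(11, -8, 5)$$

and

$$(\mathbf{q} - \mathbf{p}) - \mathbf{r} = \frac{1}{6}(7, 14, 7).$$

Hence the distance from $\mathbf{q}$ to $P$ is

$$\|(\mathbf{q} - \mathbf{p}) - \mathbf{r}\| = \frac{\sqrt{294}}{6} = \frac{7}{\sqrt{6}}.$$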
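
The steps of this example (normalize $\mathbf{v}$, remove from $\mathbf{w}$ its projection onto $\mathbf{v}$, normalize the result, then project $\mathbf{q} - \mathbf{p}$ onto both unit vectors) can be sketched in Python as follows. The function `distance_to_plane` and its helpers are illustrative names, and the sketch assumes $\mathbf{v}$ and $\mathbf{w}$ are linearly independent.

```python
import math

def dot(x, y):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def add(x, y):
    """Componentwise sum x + y."""
    return tuple(a + b for a, b in zip(x, y))

def sub(x, y):
    """Componentwise difference x - y."""
    return tuple(a - b for a, b in zip(x, y))

def scale(c, x):
    """Scalar multiple c x."""
    return tuple(c * a for a in x)

def norm(x):
    """Length of a vector."""
    return math.sqrt(dot(x, x))

def unit(x):
    """Unit vector in the direction of the nonzero vector x."""
    return scale(1 / norm(x), x)

def distance_to_plane(q, v, w, p):
    """Distance from q to the plane y = t v + s w + p.

    Assumes v and w are linearly independent.
    """
    a = unit(v)
    c = scale(dot(w, a), a)               # projection of w onto v
    b = unit(sub(w, c))                   # unit vector orthogonal to a
    d = sub(q, p)
    r = add(scale(dot(d, a), a), scale(dot(d, b), b))  # projection of q - p
    return norm(sub(d, r))

dist = distance_to_plane((2, 3, 3), (-2, 1, 0), (1, -1, 1), (-1, 2, 1))
print(dist)   # equals 7/sqrt(6)
```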



More generally, we say vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ in $\mathbb{R}^n$ are linearly independent if no one
of them can be written as a sum of scalar multiples of the others. Given a vector $\mathbf{p}$ and
linearly independent vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$, we call the set of all points $\mathbf{y}$ such that

$$\mathbf{y} = t_1 \mathbf{v}_1 + t_2 \mathbf{v}_2 + \cdots + t_k \mathbf{v}_k + \mathbf{p},$$

where $-\infty < t_j < \infty$, $j = 1, 2, \ldots, k$, a $k$-dimensional affine subspace of $\mathbb{R}^n$. In this
terminology, a line is a 1-dimensional affine subspace and a plane is a 2-dimensional affine
subspace. In the following, we will be interested primarily in lines and planes and so will
not develop the details of the more general situation at this time.

Hyperplanes 

Consider the set $L$ of all points $\mathbf{y} = (x, y)$ in $\mathbb{R}^2$ which satisfy the equation

$$ax + by + d = 0, \tag{1.4.6}$$

where $a$, $b$, and $d$ are scalars with at least one of $a$ and $b$ not being 0. If, for example,
$b \ne 0$, then we can solve for $y$, obtaining

$$y = -\frac{a}{b}x - \frac{d}{b}. \tag{1.4.7}$$

If we set $x = t$, $-\infty < t < \infty$, then the solutions to (1.4.6) are

$$\mathbf{y} = (x, y) = \left(t, -\frac{a}{b}t - \frac{d}{b}\right) = t\left(1, -\frac{a}{b}\right) + \left(0, -\frac{d}{b}\right). \tag{1.4.8}$$

Thus $L$ is a line through $\left(0, -\frac{d}{b}\right)$ in the direction of $\left(1, -\frac{a}{b}\right)$. A similar calculation shows
that if $a \ne 0$, then we can describe $L$ as the line through $\left(-\frac{d}{a}, 0\right)$ in the direction of
$\left(-\frac{b}{a}, 1\right)$. Hence in either case $L$ is a line in $\mathbb{R}^2$.

Now let $\mathbf{n} = (a, b)$ and note that (1.4.6) is equivalent to

$$\mathbf{n} \cdot \mathbf{y} + d = 0. \tag{1.4.9}$$

Moreover, if $\mathbf{p} = (p_1, p_2)$ is a point on $L$, then

$$\mathbf{n} \cdot \mathbf{p} + d = 0, \tag{1.4.10}$$

which implies that $d = -\mathbf{n} \cdot \mathbf{p}$. Thus we may write (1.4.9) as

$$\mathbf{n} \cdot \mathbf{y} - \mathbf{n} \cdot \mathbf{p} = 0,$$

and so we see that (1.4.6) is equivalent to the equation

$$\mathbf{n} \cdot (\mathbf{y} - \mathbf{p}) = 0. \tag{1.4.11}$$

Equation (1.4.11) is a normal equation for the line $L$ and $\mathbf{n}$ is a normal vector for $L$. In
words, (1.4.11) says that the line $L$ consists of all points in $\mathbb{R}^2$ whose difference with $\mathbf{p}$ is
orthogonal to $\mathbf{n}$. See Figure 1.4.8.

Example Suppose $L$ is a line in $\mathbb{R}^2$ with equation

$$2x + 3y = 1.$$

Then a normal vector for $L$ is $\mathbf{n} = (2, 3)$; to find a point on $L$, we note that when $x = 2$,
$y = -1$, so $\mathbf{p} = (2, -1)$ is a point on $L$. Thus

$$(2, 3) \cdot ((x, y) - (2, -1)) = 0,$$

or, equivalently,

$$(2, 3) \cdot (x - 2, y + 1) = 0,$$

is a normal equation for $L$. Since $\mathbf{q} = (-1, 1)$ is also a point on $L$, $L$ has direction
$\mathbf{q} - \mathbf{p} = (-3, 2)$. Thus

$$\mathbf{y} = t(-3, 2) + (2, -1)$$

is a vector equation for $L$. Note that

$$\mathbf{n} \cdot (\mathbf{q} - \mathbf{p}) = (2, 3) \cdot (-3, 2) = 0,$$

so $\mathbf{n}$ is orthogonal to $\mathbf{q} - \mathbf{p}$.

Example If $L$ is a line in $\mathbb{R}^2$ through $\mathbf{p} = (2, 3)$ in the direction of $\mathbf{v} = (-1, 2)$, then
$\mathbf{n} = (2, 1)$ is a normal vector for $L$ since $\mathbf{v} \cdot \mathbf{n} = 0$. Thus

$$(2, 1) \cdot (x - 2, y - 3) = 0$$

is a normal equation for $L$. Multiplying this out, we have

$$2(x - 2) + (y - 3) = 0;$$

that is, $L$ consists of all points $(x, y)$ in $\mathbb{R}^2$ which satisfy

$$2x + y = 7.$$

Now consider the case where $P$ is the set of all points $\mathbf{y} = (x, y, z)$ in $\mathbb{R}^3$ that satisfy
the equation

$$ax + by + cz + d = 0, \tag{1.4.12}$$

where $a$, $b$, $c$, and $d$ are scalars with at least one of $a$, $b$, and $c$ not being 0. If, for example,
$a \ne 0$, then we may solve for $x$ to obtain

$$x = -\frac{b}{a}y - \frac{c}{a}z - \frac{d}{a}. \tag{1.4.13}$$

If we set $y = t$, $-\infty < t < \infty$, and $z = s$, $-\infty < s < \infty$, the solutions to (1.4.12) are

$$\begin{aligned} \mathbf{y} = (x, y, z) &= \left(-\frac{b}{a}t - \frac{c}{a}s - \frac{d}{a},\; t,\; s\right) \\ &= t\left(-\frac{b}{a}, 1, 0\right) + s\left(-\frac{c}{a}, 0, 1\right) + \left(-\frac{d}{a}, 0, 0\right). \end{aligned} \tag{1.4.14}$$

Figure 1.4.9 $P$ is the set of points $\mathbf{y}$ for which $\mathbf{y} - \mathbf{p}$ is orthogonal to $\mathbf{n}$

Thus we see that $P$ is a plane in $\mathbb{R}^3$. In analogy with the case of lines in $\mathbb{R}^2$, if we let
$\mathbf{n} = (a, b, c)$ and let $\mathbf{p} = (p_1, p_2, p_3)$ be a point on $P$, then we have

$$\mathbf{n} \cdot \mathbf{p} + d = a p_1 + b p_2 + c p_3 + d = 0,$$

from which we see that $\mathbf{n} \cdot \mathbf{p} = -d$, and so we may write (1.4.12) as

$$\mathbf{n} \cdot (\mathbf{y} - \mathbf{p}) = 0. \tag{1.4.15}$$

We call (1.4.15) a normal equation for $P$ and we call $\mathbf{n}$ a normal vector for $P$. In words,
(1.4.15) says that the plane $P$ consists of all points in $\mathbb{R}^3$ whose difference with $\mathbf{p}$ is
orthogonal to $\mathbf{n}$. See Figure 1.4.9.

Example Let $P$ be the plane in $\mathbb{R}^3$ with vector equation

$$\mathbf{y} = t(2, 2, -1) + s(-1, 2, 1) + (1, 1, 2).$$

If we let $\mathbf{v} = (2, 2, -1)$ and $\mathbf{w} = (-1, 2, 1)$, then

$$\mathbf{n} = \mathbf{v} \times \mathbf{w} = (4, -1, 6)$$

is orthogonal to both $\mathbf{v}$ and $\mathbf{w}$. Now if $\mathbf{y}$ is on $P$, then

$$\mathbf{y} = t\mathbf{v} + s\mathbf{w} + \mathbf{p}$$

for some scalars $t$ and $s$, from which we see that

$$\mathbf{n} \cdot (\mathbf{y} - \mathbf{p}) = \mathbf{n} \cdot (t\mathbf{v} + s\mathbf{w}) = t(\mathbf{n} \cdot \mathbf{v}) + s(\mathbf{n} \cdot \mathbf{w}) = 0 + 0 = 0.$$

That is, $\mathbf{n}$ is a normal vector for $P$. So, letting $\mathbf{y} = (x, y, z)$,

$$(4, -1, 6) \cdot (x - 1, y - 1, z - 2) = 0 \tag{1.4.16}$$

is a normal equation for $P$. Multiplying (1.4.16) out, we see that $P$ consists of all points
$(x, y, z)$ in $\mathbb{R}^3$ which satisfy

$$4x - y + 6z = 15.$$

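Example Suppose $\mathbf{p} = (1, 2, 1)$, $\mathbf{q} = (-2, -1, 3)$, and $\mathbf{r} = (2, -3, -1)$ are three points
on a plane $P$ in $\mathbb{R}^3$. Then

$$\mathbf{v} = \mathbf{q} - \mathbf{p} = (-3, -3, 2)$$

and

$$\mathbf{w} = \mathbf{r} - \mathbf{p} = (1, -5, -2)$$

are vectors lying on $P$. Thus

$$\mathbf{n} = \mathbf{v} \times \mathbf{w} = (16, -4, 18)$$

is a normal vector for $P$. Hence

$$(16, -4, 18) \cdot (x - 1, y - 2, z - 1) = 0$$

is a normal equation for $P$. Thus $P$ is the set of all points $(x, y, z)$ in $\mathbb{R}^3$ satisfying

$$16x - 4y + 18z = 26.$$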
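
The procedure in this example, taking the cross product of two edge vectors to get a normal vector and then evaluating $\mathbf{n} \cdot \mathbf{p}$ for the right-hand side, can be sketched in Python. The function `plane_through` is a hypothetical helper introduced only for this illustration; it assumes the three points do not lie on one line.

```python
def dot(x, y):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def sub(x, y):
    """Componentwise difference x - y."""
    return tuple(a - b for a, b in zip(x, y))

def cross(x, y):
    """Cross product of two vectors in R^3, as in (1.3.7)."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)

def plane_through(p, q, r):
    """Return (n, k) so that the plane through p, q, r is n . (x, y, z) = k.

    Assumes p, q, and r do not all lie on the same line.
    """
    n = cross(sub(q, p), sub(r, p))
    return n, dot(n, p)

p, q, r = (1, 2, 1), (-2, -1, 3), (2, -3, -1)
n, k = plane_through(p, q, r)
print(n, k)                    # (16, -4, 18) 26
print(dot(n, q), dot(n, r))    # 26 26, so q and r satisfy the same equation
```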

The following definition generalizes the ideas in the previous examples. 

Definition Suppose $\mathbf{n}$ and $\mathbf{p}$ are vectors in $\mathbb{R}^n$ with $\mathbf{n} \ne \mathbf{0}$. The set of all vectors $\mathbf{y}$ in
$\mathbb{R}^n$ which satisfy the equation

$$\mathbf{n} \cdot (\mathbf{y} - \mathbf{p}) = 0 \tag{1.4.17}$$

is called a hyperplane through the point $\mathbf{p}$. We call $\mathbf{n}$ a normal vector for the hyperplane
and we call (1.4.17) a normal equation for the hyperplane.

In this terminology, a line in $\mathbb{R}^2$ is a hyperplane and a plane in $\mathbb{R}^3$ is a hyperplane. In
general, a hyperplane in $\mathbb{R}^n$ is an $(n - 1)$-dimensional affine subspace of $\mathbb{R}^n$. Also, note
that if we let $\mathbf{n} = (a_1, a_2, \ldots, a_n)$, $\mathbf{p} = (p_1, p_2, \ldots, p_n)$, and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$, then we
may write (1.4.17) as

$$a_1(y_1 - p_1) + a_2(y_2 - p_2) + \cdots + a_n(y_n - p_n) = 0, \tag{1.4.18}$$

or

$$a_1 y_1 + a_2 y_2 + \cdots + a_n y_n + d = 0, \tag{1.4.19}$$

where $d = -\mathbf{n} \cdot \mathbf{p}$.

Example The set of all points $(w, x, y, z)$ in $\mathbb{R}^4$ which satisfy

$$3w - x + 4y + 2z = 5$$

is a 3-dimensional hyperplane with normal vector $\mathbf{n} = (3, -1, 4, 2)$.

The normal equation description of a hyperplane simplifies a number of geometric
calculations. For example, given a hyperplane $H$ through $\mathbf{p}$ with normal vector $\mathbf{n}$ and a
point $\mathbf{q}$ in $\mathbb{R}^n$, the distance from $\mathbf{q}$ to $H$ is simply the length of the projection of $\mathbf{q} - \mathbf{p}$
onto $\mathbf{n}$. Thus if $\mathbf{u}$ is the direction of $\mathbf{n}$, then the distance from $\mathbf{q}$ to $H$ is $|(\mathbf{q} - \mathbf{p}) \cdot \mathbf{u}|$. See
Figure 1.4.10. Moreover, if we let $d = -\mathbf{p} \cdot \mathbf{n}$ as in (1.4.19), then we have

$$|(\mathbf{q} - \mathbf{p}) \cdot \mathbf{u}| = |\mathbf{q} \cdot \mathbf{u} - \mathbf{p} \cdot \mathbf{u}| = \frac{|\mathbf{q} \cdot \mathbf{n} - \mathbf{p} \cdot \mathbf{n}|}{\|\mathbf{n}\|} = \frac{|\mathbf{q} \cdot \mathbf{n} + d|}{\|\mathbf{n}\|}. \tag{1.4.20}$$

Note that, in particular, (1.4.20) may be used to find the distance from a point to a line
in $\mathbb{R}^2$ and from a point to a plane in $\mathbb{R}^3$.

Example To find the distance from the point $\mathbf{q} = (2, 3, 3)$ to the plane $P$ in $\mathbb{R}^3$ with
equation

$$x + 2y + z = 4,$$

we first note that $\mathbf{n} = (1, 2, 1)$ is a normal vector for $P$. Using (1.4.20) with $d = -4$, we
see that the distance from $\mathbf{q}$ to $P$ is

$$\frac{|\mathbf{q} \cdot \mathbf{n} + d|}{\|\mathbf{n}\|} = \frac{|(2, 3, 3) \cdot (1, 2, 1) - 4|}{\sqrt{6}} = \frac{7}{\sqrt{6}}.$$

Note that this agrees with an earlier example. 
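
Formula (1.4.20) translates directly into code. The following Python sketch uses an illustrative function name, `distance_to_hyperplane`, and takes the hyperplane in the form $\mathbf{n} \cdot \mathbf{y} + d = 0$.

```python
import math

def dot(x, y):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def distance_to_hyperplane(q, n, d):
    """Distance from q to the hyperplane n . y + d = 0, as in (1.4.20)."""
    return abs(dot(q, n) + d) / math.sqrt(dot(n, n))

# The plane x + 2y + z = 4 has n = (1, 2, 1) and d = -4.
dist = distance_to_hyperplane((2, 3, 3), (1, 2, 1), -4)
print(dist)   # equals 7/sqrt(6)
```

Since the function works for any number of coordinates, it covers lines in $\mathbb{R}^2$ and hyperplanes in $\mathbb{R}^4$ in the same way.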

We will close this section with a few words about angles between hyperplanes. Note
that a hyperplane does not have a unique normal vector. In particular, if $\mathbf{n}$ is a normal
vector for a hyperplane $H$, then $-\mathbf{n}$ is also a normal vector for $H$. Hence it is always
possible to choose the normal vectors required in the following definition.

Definition Let $G$ and $H$ be hyperplanes in $\mathbb{R}^n$ with normal equations

$$\mathbf{m} \cdot (\mathbf{y} - \mathbf{p}) = 0$$

and

$$\mathbf{n} \cdot (\mathbf{y} - \mathbf{q}) = 0,$$

respectively, chosen so that $\mathbf{m} \cdot \mathbf{n} \ge 0$. Then the angle between $G$ and $H$ is the angle
between $\mathbf{m}$ and $\mathbf{n}$. Moreover, we will say that $G$ and $H$ are orthogonal if $\mathbf{m}$ and $\mathbf{n}$ are
orthogonal and we will say $G$ and $H$ are parallel if $\mathbf{m}$ and $\mathbf{n}$ are parallel.

The effect of the choice of normal vectors in the definition is to make the angle between
the two hyperplanes be between $0$ and $\frac{\pi}{2}$.

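Example To find the angle $\theta$ between the two planes in $\mathbb{R}^3$ with equations

$$x + 2y - z = 3$$

and

$$x - 3y - z = 5,$$

we first note that the corresponding normal vectors are $\mathbf{m} = (1, 2, -1)$ and $\mathbf{n} = (1, -3, -1)$.
Since $\mathbf{m} \cdot \mathbf{n} = -4$, we will compute the angle between $\mathbf{m}$ and $-\mathbf{n}$. Hence

$$\cos(\theta) = \frac{\mathbf{m} \cdot (-\mathbf{n})}{\|\mathbf{m}\| \|\mathbf{n}\|} = \frac{4}{\sqrt{6}\sqrt{11}} = \frac{4}{\sqrt{66}}.$$

Thus, rounding to four decimal places,

$$\theta = \cos^{-1}\left(\frac{4}{\sqrt{66}}\right) = 1.0560.$$

See Figure 1.4.11.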
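
The angle computation above can be sketched in Python. Taking the absolute value of $\mathbf{m} \cdot \mathbf{n}$ has the same effect as replacing $\mathbf{n}$ by $-\mathbf{n}$ when the dot product is negative; the function name `angle_between_hyperplanes` is an illustrative choice.

```python
import math

def dot(x, y):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length of a vector."""
    return math.sqrt(dot(x, x))

def angle_between_hyperplanes(m, n):
    """Angle in radians, between 0 and pi/2, between hyperplanes with normals m and n."""
    cos_theta = abs(dot(m, n)) / (norm(m) * norm(n))
    return math.acos(min(1.0, cos_theta))   # guard against rounding slightly above 1

theta = angle_between_hyperplanes((1, 2, -1), (1, -3, -1))
print(theta)   # approximately 1.0560 radians
```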

Example The planes in $\mathbb{R}^3$ with equations

$$3x + y - 2z = 3$$

and

$$6x + 2y - 4z = 13$$

are parallel since their normal vectors are $\mathbf{m} = (3, 1, -2)$ and $\mathbf{n} = (6, 2, -4)$ and $\mathbf{n} = 2\mathbf{m}$.

Problems

1. Find vector and parametric equations for the line in $\mathbb{R}^2$ through $\mathbf{p} = (2, 3)$ in the
direction of $\mathbf{v} = (1, -2)$.

2. Find vector and parametric equations for the line in $\mathbb{R}^4$ through $\mathbf{p} = (1, -1, 2, 3)$ in
the direction of $\mathbf{v} = (-2, 3, -4, 1)$.

3. Find vector and parametric equations for the lines passing through the following pairs
of points.

(a) $\mathbf{p} = (-1, -3)$, $\mathbf{q} = (4, 2)$ (b) $\mathbf{p} = (2, 1, 3)$, $\mathbf{q} = (-1, 2, 1)$

(c) $\mathbf{p} = (3, 2, 1, 4)$, $\mathbf{q} = (2, 0, 4, 1)$ (d) $\mathbf{p} = (4, -3, 2)$, $\mathbf{q} = (1, -2, 4)$

4. Find the distance from the point $\mathbf{q} = (1, 3)$ to the line with vector equation
$\mathbf{y} = t(2, 1) + (3, 1)$.

5. Find the distance from the point $\mathbf{q} = (1, 3, -2)$ to the line with vector equation
$\mathbf{y} = t(2, -1, 4) + (1, -2, -1)$.

6. Find the distance from the point $\mathbf{r} = (-1, 2, -3)$ to the line through the points
$\mathbf{p} = (1, 0, 1)$ and $\mathbf{q} = (0, 2, -1)$.

7. Find the distance from the point $\mathbf{r} = (-1, -2, 2, 4)$ to the line through the points
$\mathbf{p} = (2, 1, 1, 2)$ and $\mathbf{q} = (1, 2, -4, 3)$.

8. Find vector and parametric equations for the plane in $\mathbb{R}^3$ which contains the points
$\mathbf{p} = (1, 3, -1)$, $\mathbf{q} = (-2, 1, 1)$, and $\mathbf{r} = (2, -3, 2)$.

9. Find vector and parametric equations for the plane in $\mathbb{R}^4$ which contains the points
$\mathbf{p} = (2, -3, 4, -1)$, $\mathbf{q} = (-1, 3, 2, -4)$, and $\mathbf{r} = (2, -1, 2, 1)$.

10. Let $P$ be the plane in $\mathbb{R}^3$ with vector equation $\mathbf{y} = t(1, 2, 1) + s(-2, 1, 3) + (1, 0, 1)$.
Find the distance from the point $\mathbf{q} = (1, 3, 1)$ to $P$.

11. Let $P$ be the plane in $\mathbb{R}^4$ with vector equation y = -2, 1, 4) + s(2, 1, 2, 3) + (1, 0, 1, 0).
Find the distance from the point $\mathbf{q} = (1, 3, 1, 3)$ to $P$.

12. Find a normal vector and a normal equation for the line in $\mathbb{R}^2$ with vector equation
$\mathbf{y} = t(1, 2) + (1, -1)$.

13. Find a normal vector and a normal equation for the line in $\mathbb{R}^2$ with vector equation
$\mathbf{y} = t(0, 1) + (2, 0)$.

14. Find a normal vector and a normal equation for the plane in $\mathbb{R}^3$ with vector equation
$\mathbf{y} = t(1, 2, 1) + s(3, 1, -1) + (1, -1, 1)$.

15. Find a normal vector and a normal equation for the line in $\mathbb{R}^2$ which passes through
the points $\mathbf{p} = (3, 2)$ and $\mathbf{q} = (-1, 3)$.

16. Find a normal vector and a normal equation for the plane in $\mathbb{R}^3$ which passes through
the points $\mathbf{p} = (1, 2, -1)$, $\mathbf{q} = (-1, 3, 1)$, and $\mathbf{r} = (2, -2, 2)$.

17. Find the distance from the point $\mathbf{q} = (3, 2)$ in $\mathbb{R}^2$ to the line with equation $x + 2y - 3 = 0$.

18. Find the distance from the point $\mathbf{q} = (1, 2, -1)$ in $\mathbb{R}^3$ to the plane with equation
$x + 2y - 3z = 4$.

19. Find the distance from the point $\mathbf{q} = (3, 2, 1, 1)$ in $\mathbb{R}^4$ to the hyperplane with equation
$3x + y - 2z + 3w = 15$.

20. Find the angle between the lines in $\mathbb{R}^2$ with equations $3x + y = 4$ and $x - y = 5$.

21. Find the angle between the planes in $\mathbb{R}^3$ with equations $3x - y + 2z = 5$ and
$x - 2y + z = 4$.

22. Find the angle between the hyperplanes in $\mathbb{R}^4$ with equations $w + x + y - z = 3$ and
$2w - x + 2y + z = 6$.

23. Find an equation for a plane in $\mathbb{R}^3$ orthogonal to the plane with equation $x + 2y - 3z = 4$
and passing through the point $\mathbf{p} = (1, -1, 2)$.

24. Find an equation for the plane in $\mathbb{R}^3$ which is parallel to the plane $x - y + 2z = 6$ and
passes through the point $\mathbf{p} = (2, 1, 2)$.

25. Show that if $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ are vectors in $\mathbb{R}^n$ with $\mathbf{x} \perp \mathbf{y}$ and $\mathbf{x} \perp \mathbf{z}$, then $\mathbf{x} \perp (a\mathbf{y} + b\mathbf{z})$
for any scalars $a$ and $b$.

26. Find parametric equations for the line of intersection of the planes in $\mathbb{R}^3$ with equations
$x + 2y - 6z = 4$ and $2x - y + z = 2$.

27. Find parametric equations for the plane of intersection of the hyperplanes in $\mathbb{R}^4$ with
equations $w - x + y + z = 3$ and $2w + 4x - y + 2z = 8$.

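28. Let $L$ be the line in $\mathbb{R}^3$ with vector equation $\mathbf{y} = t(1, 2, -1) + (3, 2, 1)$ and let $P$ be the
plane in $\mathbb{R}^3$ with equation $x + 2y - 3z = 8$. Find the point where $L$ intersects $P$.

29. Let $P$ be the plane in $\mathbb{R}^n$ with vector equation $\mathbf{y} = t\mathbf{v} + s\mathbf{w} + \mathbf{p}$. Let $\mathbf{c}$ be the projection
of $\mathbf{w}$ onto $\mathbf{v}$,

$$\mathbf{a} = \frac{1}{\|\mathbf{v}\|}\mathbf{v},$$

and

$$\mathbf{b} = \frac{1}{\|\mathbf{w} - \mathbf{c}\|}(\mathbf{w} - \mathbf{c}).$$

Show that $\mathbf{y} = t\mathbf{a} + s\mathbf{b} + \mathbf{p}$ is also a vector equation for $P$.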
Linear and Affine Functions



One of the central themes of calculus is the approximation of nonlinear functions by linear 
functions, with the fundamental concept being the derivative of a function. This section 
will introduce the linear and affine functions which will be key to understanding derivatives 
in the chapters ahead. 

Linear functions 

In the following, we will use the notation $f : \mathbb{R}^m \to \mathbb{R}^n$ to indicate a function whose
domain is a subset of $\mathbb{R}^m$ and whose range is a subset of $\mathbb{R}^n$. In other words, $f$ takes a
vector with $m$ coordinates for input and returns a vector with $n$ coordinates. For example,
the function

\[ f(x, y, z) = (\sin(x + y),\, 2x^2 + z) \]

is a function from $\mathbb{R}^3$ to $\mathbb{R}^2$.

Definition We say a function $L : \mathbb{R}^m \to \mathbb{R}^n$ is linear if (1) for any vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^m$,

\[ L(\mathbf{x} + \mathbf{y}) = L(\mathbf{x}) + L(\mathbf{y}), \tag{1.5.1} \]

and (2) for any vector $\mathbf{x}$ in $\mathbb{R}^m$ and scalar $a$,

\[ L(a\mathbf{x}) = aL(\mathbf{x}). \tag{1.5.2} \]

Example Suppose $f : \mathbb{R} \to \mathbb{R}$ is defined by $f(x) = 3x$. Then for any $x$ and $y$ in $\mathbb{R}$,

\[ f(x + y) = 3(x + y) = 3x + 3y = f(x) + f(y), \]

and for any scalar $a$,

\[ f(ax) = 3ax = af(x). \]

Thus $f$ is linear.

Example Suppose $L : \mathbb{R}^2 \to \mathbb{R}^3$ is defined by

\[ L(x_1, x_2) = (2x_1 + 3x_2,\, x_1 - x_2,\, 4x_2). \]

Then if $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$ are vectors in $\mathbb{R}^2$,

\begin{align*}
L(\mathbf{x} + \mathbf{y}) &= L(x_1 + y_1,\, x_2 + y_2) \\
&= (2(x_1 + y_1) + 3(x_2 + y_2),\, x_1 + y_1 - (x_2 + y_2),\, 4(x_2 + y_2)) \\
&= (2x_1 + 3x_2,\, x_1 - x_2,\, 4x_2) + (2y_1 + 3y_2,\, y_1 - y_2,\, 4y_2) \\
&= L(x_1, x_2) + L(y_1, y_2) \\
&= L(\mathbf{x}) + L(\mathbf{y}).
\end{align*}

1 Copyright © by Dan Sloughter 2001 
Also, for $\mathbf{x} = (x_1, x_2)$ and any scalar $a$, we have

\begin{align*}
L(a\mathbf{x}) &= L(ax_1, ax_2) \\
&= (2ax_1 + 3ax_2,\, ax_1 - ax_2,\, 4ax_2) \\
&= a(2x_1 + 3x_2,\, x_1 - x_2,\, 4x_2) \\
&= aL(\mathbf{x}).
\end{align*}

Thus $L$ is linear.
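
The two defining properties can also be spot-checked numerically. The following is a minimal sketch, not a proof: it evaluates both sides of (1.5.1) and (1.5.2) for the map $L(x_1, x_2) = (2x_1 + 3x_2, x_1 - x_2, 4x_2)$ at sample vectors. The helper names `add` and `scale` and the particular test values are assumptions made for illustration only.

```python
def L(x):
    """The example map L(x1, x2) = (2*x1 + 3*x2, x1 - x2, 4*x2)."""
    x1, x2 = x
    return (2 * x1 + 3 * x2, x1 - x2, 4 * x2)


def add(u, v):
    """Coordinate-wise sum of two vectors (hypothetical helper)."""
    return tuple(ui + vi for ui, vi in zip(u, v))


def scale(a, u):
    """Scalar multiple of a vector (hypothetical helper)."""
    return tuple(a * ui for ui in u)


# Sample integer inputs, chosen arbitrarily so the comparison is exact.
x, y, a = (1, -2), (4, 5), 3

additive = L(add(x, y)) == add(L(x), L(y))        # property (1.5.1)
homogeneous = L(scale(a, x)) == scale(a, L(x))    # property (1.5.2)
print(additive, homogeneous)
```

Agreement at a few sample points does not establish linearity, but a single failure would be enough to show that a function is not linear.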

Now suppose $L : \mathbb{R} \to \mathbb{R}$ is a linear function and let $a = L(1)$. Then for any real
number $x$,

\[ L(x) = L(1x) = xL(1) = ax. \tag{1.5.3} \]

Since any function $L : \mathbb{R} \to \mathbb{R}$ defined by $L(x) = ax$, where $a$ is a scalar, is linear (see
Problem 1), it follows that the only functions $L : \mathbb{R} \to \mathbb{R}$ which are linear are those of the
form $L(x) = ax$ for some real number $a$. For example, $f(x) = 5x$ is a linear function, but
$g(x) = \sin(x)$ is not.

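Next, suppose $L : \mathbb{R}^m \to \mathbb{R}$ is linear and let $a_1 = L(\mathbf{e}_1)$, $a_2 = L(\mathbf{e}_2)$, $\ldots$, $a_m = L(\mathbf{e}_m)$.
If $\mathbf{x} = (x_1, x_2, \ldots, x_m)$ is a vector in $\mathbb{R}^m$, then we know that

\[ \mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_m\mathbf{e}_m. \]

Thus

\begin{align*}
L(\mathbf{x}) &= L(x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_m\mathbf{e}_m) \\
&= L(x_1\mathbf{e}_1) + L(x_2\mathbf{e}_2) + \cdots + L(x_m\mathbf{e}_m) \\
&= x_1L(\mathbf{e}_1) + x_2L(\mathbf{e}_2) + \cdots + x_mL(\mathbf{e}_m) \tag{1.5.4} \\
&= x_1a_1 + x_2a_2 + \cdots + x_ma_m \\
&= \mathbf{a} \cdot \mathbf{x},
\end{align*}

where $\mathbf{a} = (a_1, a_2, \ldots, a_m)$. Since for any vector $\mathbf{a}$ in $\mathbb{R}^m$, the function $L(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x}$ is linear
(see Problem 1), it follows that the only functions $L : \mathbb{R}^m \to \mathbb{R}$ which are linear are those
of the form $L(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x}$ for some fixed vector $\mathbf{a}$ in $\mathbb{R}^m$. For example,

\[ f(x, y) = (2, -3) \cdot (x, y) = 2x - 3y \]

is a linear function from $\mathbb{R}^2$ to $\mathbb{R}$, but

\[ f(x, y, z) = x^2 y + \sin(z) \]

is not a linear function from $\mathbb{R}^3$ to $\mathbb{R}$.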
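
The derivation in (1.5.4) also gives a recipe: the vector $\mathbf{a}$ can be read off by evaluating $L$ at the standard basis vectors. Here is a small sketch of that recipe for $f(x, y) = 2x - 3y$; the helper names `standard_basis` and `dot` are assumptions introduced for this illustration.

```python
def standard_basis(m):
    """Return e_1, ..., e_m as tuples (hypothetical helper)."""
    return [tuple(1 if i == k else 0 for i in range(m)) for k in range(m)]


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))


def f(v):
    """The linear function f(x, y) = 2x - 3y from the text."""
    x, y = v
    return 2 * x - 3 * y


# a_k = f(e_k), exactly as in the derivation of (1.5.4).
a = tuple(f(e) for e in standard_basis(2))

# f and x -> a . x should then agree at any input; (5, 7) is an arbitrary sample.
sample = (5, 7)
print(a, f(sample), dot(a, sample))
```

The same recipe applied to a nonlinear function still produces a vector $\mathbf{a}$, but $\mathbf{a} \cdot \mathbf{x}$ will then fail to reproduce the function at other inputs.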

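Now consider the general case where $L : \mathbb{R}^m \to \mathbb{R}^n$ is a linear function. Given a vector
$\mathbf{x}$ in $\mathbb{R}^m$, let $L_k(\mathbf{x})$ be the $k$th coordinate of $L(\mathbf{x})$, $k = 1, 2, \ldots, n$. That is,

\[ L(\mathbf{x}) = (L_1(\mathbf{x}), L_2(\mathbf{x}), \ldots, L_n(\mathbf{x})). \]

Since $L$ is linear, for any $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^m$ we have

\[ L(\mathbf{x} + \mathbf{y}) = L(\mathbf{x}) + L(\mathbf{y}), \]

or, in terms of the coordinate functions,

\begin{align*}
(L_1(\mathbf{x} + \mathbf{y}), L_2(\mathbf{x} + \mathbf{y}), \ldots, L_n(\mathbf{x} + \mathbf{y})) &= (L_1(\mathbf{x}), L_2(\mathbf{x}), \ldots, L_n(\mathbf{x})) + (L_1(\mathbf{y}), L_2(\mathbf{y}), \ldots, L_n(\mathbf{y})) \\
&= (L_1(\mathbf{x}) + L_1(\mathbf{y}), L_2(\mathbf{x}) + L_2(\mathbf{y}), \ldots, L_n(\mathbf{x}) + L_n(\mathbf{y})).
\end{align*}

Hence $L_k(\mathbf{x} + \mathbf{y}) = L_k(\mathbf{x}) + L_k(\mathbf{y})$ for $k = 1, 2, \ldots, n$. Similarly, if $\mathbf{x}$ is in $\mathbb{R}^m$ and $a$ is a
scalar, then $L(a\mathbf{x}) = aL(\mathbf{x})$, so

\begin{align*}
(L_1(a\mathbf{x}), L_2(a\mathbf{x}), \ldots, L_n(a\mathbf{x})) &= a(L_1(\mathbf{x}), L_2(\mathbf{x}), \ldots, L_n(\mathbf{x})) \\
&= (aL_1(\mathbf{x}), aL_2(\mathbf{x}), \ldots, aL_n(\mathbf{x})).
\end{align*}

Hence $L_k(a\mathbf{x}) = aL_k(\mathbf{x})$ for $k = 1, 2, \ldots, n$. Thus for each $k = 1, 2, \ldots, n$, $L_k : \mathbb{R}^m \to \mathbb{R}$ is
a linear function. It follows from our work above that, for each $k = 1, 2, \ldots, n$, there is a
fixed vector $\mathbf{a}_k$ in $\mathbb{R}^m$ such that $L_k(\mathbf{x}) = \mathbf{a}_k \cdot \mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^m$. Hence we have

\[ L(\mathbf{x}) = (\mathbf{a}_1 \cdot \mathbf{x}, \mathbf{a}_2 \cdot \mathbf{x}, \ldots, \mathbf{a}_n \cdot \mathbf{x}) \tag{1.5.5} \]

for all $\mathbf{x}$ in $\mathbb{R}^m$. Since any function defined as in (1.5.5) is linear (see Problem 1 again), it
follows that the only linear functions from $\mathbb{R}^m$ to $\mathbb{R}^n$ must be of this form.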

Theorem If $L : \mathbb{R}^m \to \mathbb{R}^n$ is linear, then there exist vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ in $\mathbb{R}^m$ such
that

\[ L(\mathbf{x}) = (\mathbf{a}_1 \cdot \mathbf{x}, \mathbf{a}_2 \cdot \mathbf{x}, \ldots, \mathbf{a}_n \cdot \mathbf{x}) \tag{1.5.6} \]

for all $\mathbf{x}$ in $\mathbb{R}^m$.

Example In a previous example, we showed that the function $L : \mathbb{R}^2 \to \mathbb{R}^3$ defined by

\[ L(x_1, x_2) = (2x_1 + 3x_2,\, x_1 - x_2,\, 4x_2) \]

is linear. We can see this more easily now by noting that

\[ L(x_1, x_2) = ((2, 3) \cdot (x_1, x_2),\, (1, -1) \cdot (x_1, x_2),\, (0, 4) \cdot (x_1, x_2)). \]

Example The function

\[ f(x, y, z) = (x + y,\, \sin(x + y + z)) \]

is not linear since it cannot be written in the form of (1.5.6). In particular, the function
$f_2(x, y, z) = \sin(x + y + z)$ is not linear; from our work above, it follows that $f$ is not linear.
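Matrix notation

We will now develop some notation to simplify working with expressions such as (1.5.6).
First, we define an $n \times m$ matrix to be an array of real numbers with $n$ rows and $m$
columns. For example,

\[ M = \begin{bmatrix} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{bmatrix} \]

is a $3 \times 2$ matrix. Next, we will identify a vector $\mathbf{x} = (x_1, x_2, \ldots, x_m)$ in $\mathbb{R}^m$ with the $m \times 1$
matrix

\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}, \]

which is called a column vector. Now define the product $M\mathbf{x}$ of an $n \times m$ matrix $M$ with
an $m \times 1$ column vector $\mathbf{x}$ to be the $n \times 1$ column vector whose $k$th entry, $k = 1, 2, \ldots, n$,
is the dot product of the $k$th row of $M$ with $\mathbf{x}$. For example,

\[ \begin{bmatrix} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 + 3 \\ 2 - 1 \\ 0 + 4 \end{bmatrix} = \begin{bmatrix} 7 \\ 1 \\ 4 \end{bmatrix}. \]

In fact, for any vector $\mathbf{x} = (x_1, x_2)$ in $\mathbb{R}^2$,

\[ \begin{bmatrix} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2x_1 + 3x_2 \\ x_1 - x_2 \\ 4x_2 \end{bmatrix}. \]

In other words, if we let

\[ L(x_1, x_2) = (2x_1 + 3x_2,\, x_1 - x_2,\, 4x_2), \]

as in a previous example, then, using column vectors, we could write

\[ L(x_1, x_2) = \begin{bmatrix} 2 & 3 \\ 1 & -1 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}. \]

In general, consider a linear function $L : \mathbb{R}^m \to \mathbb{R}^n$ defined by

\[ L(\mathbf{x}) = (\mathbf{a}_1 \cdot \mathbf{x}, \mathbf{a}_2 \cdot \mathbf{x}, \ldots, \mathbf{a}_n \cdot \mathbf{x}) \tag{1.5.7} \]

for some vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ in $\mathbb{R}^m$. If we let $M$ be the $n \times m$ matrix whose $k$th row is
$\mathbf{a}_k$, $k = 1, 2, \ldots, n$, then

\[ L(\mathbf{x}) = M\mathbf{x} \tag{1.5.8} \]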
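for any $\mathbf{x}$ in $\mathbb{R}^m$. Now, from our work above,

\[ \mathbf{a}_k = (L_k(\mathbf{e}_1), L_k(\mathbf{e}_2), \ldots, L_k(\mathbf{e}_m)), \tag{1.5.9} \]

which means that the $j$th column of $M$ is

\[ \begin{bmatrix} L_1(\mathbf{e}_j) \\ L_2(\mathbf{e}_j) \\ \vdots \\ L_n(\mathbf{e}_j) \end{bmatrix}, \tag{1.5.10} \]

$j = 1, 2, \ldots, m$. But (1.5.10) is just $L(\mathbf{e}_j)$ written as a column vector. Hence $M$ is the
matrix whose columns are given by the column vectors $L(\mathbf{e}_1), L(\mathbf{e}_2), \ldots, L(\mathbf{e}_m)$.

Theorem Suppose $L : \mathbb{R}^m \to \mathbb{R}^n$ is a linear function and $M$ is the $n \times m$ matrix whose
$j$th column is $L(\mathbf{e}_j)$, $j = 1, 2, \ldots, m$. Then for any vector $\mathbf{x}$ in $\mathbb{R}^m$,

\[ L(\mathbf{x}) = M\mathbf{x}. \tag{1.5.11} \]

Example Suppose $L : \mathbb{R}^3 \to \mathbb{R}^2$ is defined by

\[ L(x, y, z) = (3x - 2y + z,\, 4x + y). \]

Then

\begin{align*}
L(\mathbf{e}_1) &= L(1, 0, 0) = (3, 4), \\
L(\mathbf{e}_2) &= L(0, 1, 0) = (-2, 1),
\end{align*}

and

\[ L(\mathbf{e}_3) = L(0, 0, 1) = (1, 0). \]

So if we let

\[ M = \begin{bmatrix} 3 & -2 & 1 \\ 4 & 1 & 0 \end{bmatrix}, \]

then

\[ L(x, y, z) = \begin{bmatrix} 3 & -2 & 1 \\ 4 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}. \]

For example,

\[ L(1, -1, 3) = \begin{bmatrix} 3 & -2 & 1 \\ 4 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 3 + 2 + 3 \\ 4 - 1 + 0 \end{bmatrix} = \begin{bmatrix} 8 \\ 3 \end{bmatrix}. \]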
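
The theorem translates directly into a procedure: evaluate $L$ at each standard basis vector and use the results as columns. The sketch below does this for $L(x, y, z) = (3x - 2y + z, 4x + y)$, storing a matrix as a list of rows. The function names `matrix_of` and `matvec` are assumptions made for this illustration, not notation from the text.

```python
def L(v):
    """The linear map L(x, y, z) = (3x - 2y + z, 4x + y)."""
    x, y, z = v
    return (3 * x - 2 * y + z, 4 * x + y)


def matrix_of(func, m):
    """Build the matrix whose j-th column is func(e_j), j = 1, ..., m."""
    basis = [tuple(1 if i == j else 0 for i in range(m)) for j in range(m)]
    columns = [func(e) for e in basis]
    # zip(*columns) transposes the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


def matvec(M, x):
    """Product Mx: the k-th entry is the dot product of row k of M with x."""
    return tuple(sum(r * xi for r, xi in zip(row, x)) for row in M)


M = matrix_of(L, 3)
print(M)
print(matvec(M, (1, -1, 3)), L((1, -1, 3)))
```

Note that `matrix_of` only produces a faithful representation when `func` really is linear; for a nonlinear function the resulting matrix would agree with the function at the basis vectors but not in general.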
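Figure 1.5.1 Rotating a vector in the plane

Example Let $R_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ be the function that rotates a vector $\mathbf{x}$ in $\mathbb{R}^2$ counterclockwise
through an angle $\theta$, as shown in Figure 1.5.1. Geometrically, it seems reasonable that
$R_\theta$ is a linear function; that is, rotating the vector $\mathbf{x} + \mathbf{y}$ through an angle $\theta$ should give
the same result as first rotating $\mathbf{x}$ and $\mathbf{y}$ separately through an angle $\theta$ and then adding,
and rotating a vector $a\mathbf{x}$ through an angle $\theta$ should give the same result as first rotating
$\mathbf{x}$ through an angle $\theta$ and then multiplying by $a$. Now, from the definition of $\cos(\theta)$ and
$\sin(\theta)$,

\[ R_\theta(\mathbf{e}_1) = R_\theta(1, 0) = (\cos(\theta), \sin(\theta)) \]

(see Figure 1.5.2), and, since $\mathbf{e}_2$ is $\mathbf{e}_1$ rotated, counterclockwise, through an angle $\frac{\pi}{2}$,

\[ R_\theta(\mathbf{e}_2) = R_{\theta + \frac{\pi}{2}}(\mathbf{e}_1) = \left(\cos\left(\theta + \frac{\pi}{2}\right), \sin\left(\theta + \frac{\pi}{2}\right)\right) = (-\sin(\theta), \cos(\theta)). \]

Figure 1.5.2 Rotating $\mathbf{e}_1$ through an angle $\theta$

Hence

\[ R_\theta(x, y) = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \tag{1.5.12} \]

You are asked in Problem 9 to verify that the linear function defined in (1.5.12) does in
fact rotate vectors through an angle $\theta$ in the counterclockwise direction. Note that, for
example, when $\theta = \frac{\pi}{2}$, we have

\[ R_{\frac{\pi}{2}}(x, y) = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \]

In particular, note that $R_{\frac{\pi}{2}}(1, 0) = (0, 1)$ and $R_{\frac{\pi}{2}}(0, 1) = (-1, 0)$; that is, $R_{\frac{\pi}{2}}$ takes $\mathbf{e}_1$ to
$\mathbf{e}_2$ and $\mathbf{e}_2$ to $-\mathbf{e}_1$.

For another example, if $\theta = \frac{\pi}{6}$, then

\[ R_{\frac{\pi}{6}}(x, y) = \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \]

In particular,

\[ R_{\frac{\pi}{6}}(1, 2) = \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3}}{2} - 1 \\ \frac{1}{2} + \sqrt{3} \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3} - 2}{2} \\ \frac{1 + 2\sqrt{3}}{2} \end{bmatrix}. \]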
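
As a quick numerical companion to (1.5.12), the following sketch builds the rotation matrix for a given angle and applies it to the vector $(1, 2)$ with $\theta = \pi/6$. The names `rotation_matrix` and `matvec` are assumptions for illustration; results are floating point, so comparisons should use a tolerance.

```python
import math


def rotation_matrix(theta):
    """Matrix of the counterclockwise rotation through theta, as in (1.5.12)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matvec(M, x):
    """Product Mx, computed row by row as dot products."""
    return tuple(sum(r * xi for r, xi in zip(row, x)) for row in M)


v = matvec(rotation_matrix(math.pi / 6), (1, 2))

# Closed form from the text: ((sqrt(3) - 2)/2, (1 + 2*sqrt(3))/2).
expected = ((math.sqrt(3) - 2) / 2, (1 + 2 * math.sqrt(3)) / 2)
print(v, expected)

# A rotation should not change the length of a vector.
print(math.hypot(*v), math.hypot(1, 2))
```

The length check at the end is the numerical counterpart of part (a) of Problem 9.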



Affine functions

Definition We say a function $A : \mathbb{R}^m \to \mathbb{R}^n$ is affine if there is a linear function
$L : \mathbb{R}^m \to \mathbb{R}^n$ and a vector $\mathbf{b}$ in $\mathbb{R}^n$ such that

\[ A(\mathbf{x}) = L(\mathbf{x}) + \mathbf{b} \tag{1.5.13} \]

for all $\mathbf{x}$ in $\mathbb{R}^m$.

An affine function is just a linear function plus a translation. From our knowledge of
linear functions, it follows that if $A : \mathbb{R}^m \to \mathbb{R}^n$ is affine, then there is an $n \times m$ matrix $M$
and a vector $\mathbf{b}$ in $\mathbb{R}^n$ such that

\[ A(\mathbf{x}) = M\mathbf{x} + \mathbf{b} \tag{1.5.14} \]

for all $\mathbf{x}$ in $\mathbb{R}^m$. In particular, if $f : \mathbb{R} \to \mathbb{R}$ is affine, then there are real numbers $m$ and $b$
such that

\[ f(x) = mx + b \tag{1.5.15} \]

for all real numbers $x$.
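
Formula (1.5.14) can be sketched in a few lines of code. The example below constructs an affine function from a matrix and a translation vector; the values $M$ and $\mathbf{b}$ used here correspond to $A(x, y) = (2x + 3, y - 4x + 1)$ and are chosen purely for illustration, and the factory name `affine` is an assumption.

```python
def affine(M, b):
    """Return the function A(x) = Mx + b for a matrix M (list of rows) and vector b."""
    def A(x):
        Mx = [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]
        return tuple(mx_i + b_i for mx_i, b_i in zip(Mx, b))
    return A


# Illustrative values: A(x, y) = (2x + 3, y - 4x + 1).
A = affine([[2, 0], [-4, 1]], (3, 1))

print(A((0, 0)))   # the translation vector b itself
print(A((1, 2)))
```

Since $A(\mathbf{0}) = \mathbf{b}$, an affine function with $\mathbf{b} \neq \mathbf{0}$ does not send the zero vector to itself, which is one quick way to see that it is not linear.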
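Example The function

\[ A(x, y) = (2x + 3,\, y - 4x + 1) \]

is an affine function from $\mathbb{R}^2$ to $\mathbb{R}^2$ since we may write it in the form

\[ A(x, y) = L(x, y) + (3, 1), \]

where $L$ is the linear function

\[ L(x, y) = (2x,\, y - 4x). \]

Note that $L(1, 0) = (2, -4)$ and $L(0, 1) = (0, 1)$, so we may also write $A$ in the form

\[ A(x, y) = \begin{bmatrix} 2 & 0 \\ -4 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 3 \\ 1 \end{bmatrix}. \]

Example The affine function

\[ A(x, y) = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 1 \\ 2 \end{bmatrix} \]

first rotates a vector, counterclockwise, in $\mathbb{R}^2$ through an angle of $\frac{\pi}{4}$ and then translates it
by the vector $(1, 2)$.

Problems

1. Let $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ be vectors in $\mathbb{R}^m$ and define $L : \mathbb{R}^m \to \mathbb{R}^n$ by

\[ L(\mathbf{x}) = (\mathbf{a}_1 \cdot \mathbf{x}, \mathbf{a}_2 \cdot \mathbf{x}, \ldots, \mathbf{a}_n \cdot \mathbf{x}). \]

Show that $L$ is linear. What does $L$ look like in the special cases

(a) $m = n = 1$?

(b) $n = 1$?

(c) $m = 1$?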

2. For each of the following functions $f$, find the dimension of the domain space, the
dimension of the range space, and state whether the function is linear, affine, or neither.

(a) $f(x, y) = (3x - y,\, 4x,\, x + y)$
(b) $f(x, y) = (4x + 7y,\, 5xy)$
(c) $f(x, y, z) = (3x + z,\, y - z,\, y - 2x)$
(d) $f(x, y, z) = (3x - 4z,\, x + y + 2z)$
(e) $f(x, y, z) = \left(3x + 5,\, y + z,\, \frac{1}{x + y + z}\right)$
(f) $f(x, y) = 3x + y - 2$
(g) $f(x) = (x,\, 3x)$
(h) $f(w, x, y, z) = (3x,\, w + x - y + z - 5)$
(i) $f(x, y) = (\sin(x + y),\, x + y)$
(j) $f(x, y) = (x^2 + y^2,\, x - y,\, x^2 - y^2)$
(k) $f(x, y, z) = (3x + 5,\, y + z,\, 3x - z + 6,\, z - 1)$






3. For each of the following linear functions $L$, find a matrix $M$ such that $L(\mathbf{x}) = M\mathbf{x}$.

(a) $L(x, y) = (x + y,\, 2x - 3y)$
(b) $L(w, x, y, z) = (x, y, z, w)$
(c) $L(x) = (3x, x, 4x)$
(d) $L(x) = -5x$
(e) $L(x, y, z) = 4x - 3y + 2z$
(f) $L(x, y, z) = (x + y + z,\, 3x - y,\, y + 2z)$
(g) $L(x, y) = (2x,\, 3y,\, x + y,\, x - y,\, 2x - 3y)$
(h) $L(x, y) = (x, y)$
(i) $L(w, x, y, z) = (2w + x - y + 3z,\, w + 2x - 3z)$

4. For each of the following affine functions $A$, find a matrix $M$ and a vector $\mathbf{b}$ such that
$A(\mathbf{x}) = M\mathbf{x} + \mathbf{b}$.

(a) $A(x, y) = (3x + 4y - 6,\, 2x + y - 3)$
(b) $A(x) = 3x - 4$
(c) $A(x, y, z) = (3x + y - 4,\, y - z + 1,\, 5)$
(d) $A(w, x, y, z) = (1, 2, 3, 4)$
(e) $A(x, y, z) = 3x - 4y + z - 1$
(f) $A(x) = (3x, -x, 2)$
(g) $A(x_1, x_2, x_3) = (x_1 - x_2 + 1,\, x_1 - x_3 + 1,\, x_2 + x_3)$



Multiply the following 

(a) 



2 
2 



(b) 



-1 
3 
-1 



(c) [1 2 1-3] 



6. Let L : 



2 
3 
-2 
1 

be the linear function that maps a vector x = (x, y) to its reflection 





"1 


2 


1" 




2" 


(d) 


3 


2 


3 




-1 







1 


2 




2 



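across the horizontal axis. Find the matrix $M$ such that $L(\mathbf{x}) = M\mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^2$.

7. Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear function that maps a vector $\mathbf{x} = (x, y)$ to its reflection
across the line $y = x$. Find the matrix $M$ such that $L(\mathbf{x}) = M\mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^2$.

8. Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear function that maps a vector $\mathbf{x} = (x, y)$ to its reflection
across the line $y = -x$. Find the matrix $M$ such that $L(\mathbf{x}) = M\mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^2$.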

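9. Let $R_\theta$ be defined as in (1.5.12).

(a) Show that for any $\mathbf{x}$ in $\mathbb{R}^2$, $\|R_\theta(\mathbf{x})\| = \|\mathbf{x}\|$.

(b) For any $\mathbf{x}$ in $\mathbb{R}^2$, let $\alpha$ be the angle between $\mathbf{x}$ and $R_\theta(\mathbf{x})$. Show that $\cos(\alpha) = \cos(\theta)$.
Together with (a), this verifies that $R_\theta(\mathbf{x})$ is the rotation of $\mathbf{x}$ through an
angle $\theta$.

10. Let $S_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear function that rotates a vector $\mathbf{x}$ clockwise through an
angle $\theta$. Find the matrix $M$ such that $S_\theta(\mathbf{x}) = M\mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^2$.

11. Given a function $f : \mathbb{R}^m \to \mathbb{R}^n$, we call the set

\[ \{\mathbf{y} : \mathbf{y} = f(\mathbf{x}) \text{ for some } \mathbf{x} \text{ in } \mathbb{R}^m\} \]

the image, or range, of $f$.

(a) Suppose $L : \mathbb{R} \to \mathbb{R}^n$ is linear with $L(1) \neq \mathbf{0}$. Show that the image of $L$ is a line in
$\mathbb{R}^n$ which passes through $\mathbf{0}$.

(b) Suppose $L : \mathbb{R}^2 \to \mathbb{R}^n$ is linear and $L(\mathbf{e}_1)$ and $L(\mathbf{e}_2)$ are linearly independent.
Show that the image of $L$ is a plane in $\mathbb{R}^n$ which passes through $\mathbf{0}$.

12. Given a function $f : \mathbb{R}^m \to \mathbb{R}$, we call the set

\[ \{(x_1, x_2, \ldots, x_m, x_{m+1}) : x_{m+1} = f(x_1, x_2, \ldots, x_m)\} \]

the graph of $f$. Show that if $L : \mathbb{R}^m \to \mathbb{R}$ is linear, then the graph of $L$ is a hyperplane
in $\mathbb{R}^{m+1}$.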


The Calculus of Functions 
of 

Several Variables 



Section 1.6 

Operations with Matrices 



In the previous section we saw the important connection between linear functions and 
matrices. In this section we will discuss various operations on matrices which we will find 
useful in our later work with linear functions. 



The algebra of matrices 

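If $M$ is an $n \times m$ matrix with $a_{ij}$ in the $i$th row and $j$th column, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$,
then we will write $M = [a_{ij}]$. With this notation the definitions of addition,
subtraction, and scalar multiplication for matrices are straightforward.

Definition Suppose $M = [a_{ij}]$ and $N = [b_{ij}]$ are $n \times m$ matrices and $c$ is a real number.
Then we define

\[ M + N = [a_{ij} + b_{ij}], \tag{1.6.1} \]

\[ M - N = [a_{ij} - b_{ij}], \tag{1.6.2} \]

and

\[ cM = [ca_{ij}]. \tag{1.6.3} \]

In other words, we define addition, subtraction, and scalar multiplication for matrices
by performing these operations on the individual elements of the matrices, in a manner
similar to the way we perform these operations on vectors.

Example If

\[ M = \begin{bmatrix} 1 & 2 & 3 \\ -5 & 3 & -1 \end{bmatrix} \]

and

\[ N = \begin{bmatrix} 3 & 1 & 4 \\ 1 & -3 & 2 \end{bmatrix}, \]

then, for example,

\[ M + N = \begin{bmatrix} 1 + 3 & 2 + 1 & 3 + 4 \\ -5 + 1 & 3 - 3 & -1 + 2 \end{bmatrix} = \begin{bmatrix} 4 & 3 & 7 \\ -4 & 0 & 1 \end{bmatrix}, \]

\[ M - N = \begin{bmatrix} 1 - 3 & 2 - 1 & 3 - 4 \\ -5 - 1 & 3 + 3 & -1 - 2 \end{bmatrix} = \begin{bmatrix} -2 & 1 & -1 \\ -6 & 6 & -3 \end{bmatrix}, \]

and

\[ 3M = \begin{bmatrix} 3 & 6 & 9 \\ -15 & 9 & -3 \end{bmatrix}. \]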
These operations have natural interpretations in terms of linear functions. Suppose
$L : \mathbb{R}^m \to \mathbb{R}^n$ and $K : \mathbb{R}^m \to \mathbb{R}^n$ are linear with $L(\mathbf{x}) = M\mathbf{x}$ and $K(\mathbf{x}) = N\mathbf{x}$ for $n \times m$
matrices $M$ and $N$. If we define $L + K : \mathbb{R}^m \to \mathbb{R}^n$ by

\[ (L + K)(\mathbf{x}) = L(\mathbf{x}) + K(\mathbf{x}), \tag{1.6.4} \]

then

\[ (L + K)(\mathbf{e}_j) = L(\mathbf{e}_j) + K(\mathbf{e}_j) \tag{1.6.5} \]

for $j = 1, 2, \ldots, m$. Hence the $j$th column of the matrix which represents $L + K$ is the sum
of the $j$th columns of $M$ and $N$. In other words,

\[ (L + K)(\mathbf{x}) = (M + N)\mathbf{x} \tag{1.6.6} \]

for all $\mathbf{x}$ in $\mathbb{R}^m$. Similarly, if we define $L - K : \mathbb{R}^m \to \mathbb{R}^n$ by

\[ (L - K)(\mathbf{x}) = L(\mathbf{x}) - K(\mathbf{x}), \tag{1.6.7} \]

then

\[ (L - K)(\mathbf{x}) = (M - N)\mathbf{x}. \tag{1.6.8} \]

If, for any scalar $c$, we define $cL : \mathbb{R}^m \to \mathbb{R}^n$ by

\[ cL(\mathbf{x}) = c(L(\mathbf{x})), \tag{1.6.9} \]

then

\[ cL(\mathbf{e}_j) = c(L(\mathbf{e}_j)) \tag{1.6.10} \]

for $j = 1, 2, \ldots, m$. Hence the $j$th column of the matrix which represents $cL$ is the scalar
$c$ times the $j$th column of $M$. That is,

\[ cL(\mathbf{x}) = (cM)\mathbf{x} \tag{1.6.11} \]

for all $\mathbf{x}$ in $\mathbb{R}^m$. In short, the operations of addition, subtraction, and scalar multiplication
for matrices correspond in a natural way with the operations of addition, subtraction,
and scalar multiplication for linear functions.

Now consider the case where $L : \mathbb{R}^m \to \mathbb{R}^p$ and $K : \mathbb{R}^p \to \mathbb{R}^n$ are linear functions. Let
$M$ be the $p \times m$ matrix such that $L(\mathbf{x}) = M\mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^m$ and let $N$ be the $n \times p$
matrix such that $K(\mathbf{x}) = N\mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^p$. Since for any $\mathbf{x}$ in $\mathbb{R}^m$, $L(\mathbf{x})$ is in $\mathbb{R}^p$, we can
form $K \circ L : \mathbb{R}^m \to \mathbb{R}^n$, the composition of $K$ with $L$, defined by

\[ K \circ L(\mathbf{x}) = K(L(\mathbf{x})). \tag{1.6.12} \]

Now

\[ K(L(\mathbf{x})) = N(M\mathbf{x}), \tag{1.6.13} \]

so it would be natural to define $NM$, the product of the matrices $N$ and $M$, to be the
matrix of $K \circ L$, in which case we would have

\[ N(M\mathbf{x}) = (NM)\mathbf{x}. \tag{1.6.14} \]

Thus we want the $j$th column of $NM$, $j = 1, 2, \ldots, m$, to be

\[ K \circ L(\mathbf{e}_j) = N(L(\mathbf{e}_j)), \tag{1.6.15} \]

which is just the dot product of $L(\mathbf{e}_j)$ with the rows of $N$. But $L(\mathbf{e}_j)$ is the $j$th column of
$M$, so the $j$th column of $NM$ is formed by taking the dot product of the $j$th column of $M$
with the rows of $N$. In other words, the entry in the $i$th row and $j$th column of $NM$ is the
dot product of the $i$th row of $N$ with the $j$th column of $M$. We write this out explicitly
in the following definition.

Definition If $N = [a_{ij}]$ is an $n \times p$ matrix and $M = [b_{ij}]$ is a $p \times m$ matrix, then we
define the product of $N$ and $M$ to be the $n \times m$ matrix $NM = [c_{ij}]$, where

\[ c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}, \tag{1.6.16} \]

$i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$.

Note that $NM$ is an $n \times m$ matrix since $K \circ L : \mathbb{R}^m \to \mathbb{R}^n$. Moreover, the product
$NM$ of two matrices $N$ and $M$ is defined only when the number of columns of $N$ is equal
to the number of rows of $M$.
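
Formula (1.6.16) can be written out almost verbatim in code. The following sketch stores a matrix as a list of rows and checks the defining property $N(M\mathbf{x}) = (NM)\mathbf{x}$ on one sample vector. The small matrices and the function names `matmul` and `matvec` are assumptions chosen for illustration.

```python
def matmul(N, M):
    """Product NM via c_ij = sum over k of a_ik * b_kj, as in (1.6.16)."""
    p = len(M)  # number of rows of M
    if any(len(row) != p for row in N):
        raise ValueError("the number of columns of N must equal the number of rows of M")
    m = len(M[0])
    return [[sum(N[i][k] * M[k][j] for k in range(p)) for j in range(m)]
            for i in range(len(N))]


def matvec(M, x):
    """Product Mx, computed row by row as dot products."""
    return tuple(sum(r * xi for r, xi in zip(row, x)) for row in M)


# Hypothetical 2 x 2 matrices, used only to exercise the definition.
N = [[1, 2], [3, 4]]
M = [[0, 1], [1, 0]]
x = (5, -1)

NM = matmul(N, M)
print(NM, matmul(M, N))                        # the two orders differ
print(matvec(N, matvec(M, x)), matvec(NM, x))  # N(Mx) and (NM)x agree
```

The size check mirrors the remark above: the product is only defined when the column count of the left factor matches the row count of the right factor.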

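Example If

\[ N = \begin{bmatrix} 1 & 2 \\ -1 & 3 \\ 2 & -2 \end{bmatrix} \]

and

\[ M = \begin{bmatrix} 2 & -2 & 1 & 3 \\ 1 & 2 & -1 & -2 \end{bmatrix}, \]

then

\begin{align*}
NM &= \begin{bmatrix} 1 & 2 \\ -1 & 3 \\ 2 & -2 \end{bmatrix} \begin{bmatrix} 2 & -2 & 1 & 3 \\ 1 & 2 & -1 & -2 \end{bmatrix} \\
&= \begin{bmatrix} 2 + 2 & -2 + 4 & 1 - 2 & 3 - 4 \\ -2 + 3 & 2 + 6 & -1 - 3 & -3 - 6 \\ 4 - 2 & -4 - 4 & 2 + 2 & 6 + 4 \end{bmatrix} \\
&= \begin{bmatrix} 4 & 2 & -1 & -1 \\ 1 & 8 & -4 & -9 \\ 2 & -8 & 4 & 10 \end{bmatrix}.
\end{align*}

Note that $N$ is $3 \times 2$, $M$ is $2 \times 4$, and $NM$ is $3 \times 4$. Also, note that it is not possible to
form the product in the other order.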

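Example Let $L : \mathbb{R}^2 \to \mathbb{R}^3$ be the linear function defined by

\[ L(x, y) = (3x - 2y,\, x + y,\, 4y) \]

and let $K : \mathbb{R}^3 \to \mathbb{R}^2$ be the linear function defined by

\[ K(x, y, z) = (2x - y + z,\, x - y - z). \]

Then the matrix for $L$ is

\[ M = \begin{bmatrix} 3 & -2 \\ 1 & 1 \\ 0 & 4 \end{bmatrix}, \]

the matrix for $K$ is

\[ N = \begin{bmatrix} 2 & -1 & 1 \\ 1 & -1 & -1 \end{bmatrix}, \]

and the matrix for $K \circ L : \mathbb{R}^2 \to \mathbb{R}^2$ is

\[ NM = \begin{bmatrix} 2 & -1 & 1 \\ 1 & -1 & -1 \end{bmatrix} \begin{bmatrix} 3 & -2 \\ 1 & 1 \\ 0 & 4 \end{bmatrix} = \begin{bmatrix} 6 - 1 + 0 & -4 - 1 + 4 \\ 3 - 1 + 0 & -2 - 1 - 4 \end{bmatrix} = \begin{bmatrix} 5 & -1 \\ 2 & -7 \end{bmatrix}. \]

In other words,

\[ K \circ L(x, y) = \begin{bmatrix} 5 & -1 \\ 2 & -7 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 5x - y \\ 2x - 7y \end{bmatrix}. \]

Note that in this case it is possible to form the composition in the other order. The
matrix for $L \circ K : \mathbb{R}^3 \to \mathbb{R}^3$ is

\[ MN = \begin{bmatrix} 3 & -2 \\ 1 & 1 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} 2 & -1 & 1 \\ 1 & -1 & -1 \end{bmatrix} = \begin{bmatrix} 6 - 2 & -3 + 2 & 3 + 2 \\ 2 + 1 & -1 - 1 & 1 - 1 \\ 0 + 4 & 0 - 4 & 0 - 4 \end{bmatrix} = \begin{bmatrix} 4 & -1 & 5 \\ 3 & -2 & 0 \\ 4 & -4 & -4 \end{bmatrix}, \]

and so

\[ L \circ K(x, y, z) = \begin{bmatrix} 4 & -1 & 5 \\ 3 & -2 & 0 \\ 4 & -4 & -4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 4x - y + 5z \\ 3x - 2y \\ 4x - 4y - 4z \end{bmatrix}. \]

In particular, note that not only is $NM \neq MN$, but in fact $NM$ and $MN$ are not even
the same size.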
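
The example above can be replayed numerically. This sketch, under the assumption that matrices are stored as lists of rows, confirms that applying $L$ and then $K$ to a sample vector gives the same result as multiplying by $NM$, and that the two products have different sizes. The names `matmul` and `matvec` and the sample vector are illustrative choices.

```python
def L(v):
    """L(x, y) = (3x - 2y, x + y, 4y)."""
    x, y = v
    return (3 * x - 2 * y, x + y, 4 * y)


def K(v):
    """K(x, y, z) = (2x - y + z, x - y - z)."""
    x, y, z = v
    return (2 * x - y + z, x - y - z)


def matmul(A, B):
    """Matrix product AB; assumes the sizes are compatible."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matvec(A, x):
    """Product Ax, computed row by row as dot products."""
    return tuple(sum(r * xi for r, xi in zip(row, x)) for row in A)


M = [[3, -2], [1, 1], [0, 4]]    # matrix for L
N = [[2, -1, 1], [1, -1, -1]]    # matrix for K

NM, MN = matmul(N, M), matmul(M, N)
v = (2, 3)                       # arbitrary sample input
print(NM, K(L(v)), matvec(NM, v))
print(len(NM), len(NM[0]), len(MN), len(MN[0]))   # 2 x 2 versus 3 x 3
```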

Determinants 

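The notion of the determinant of a matrix is closely related to the idea of area and volume.
To begin our definition, consider the $2 \times 2$ matrix

\[ M = \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} \]

and let $\mathbf{a} = (a_1, a_2)$ and $\mathbf{b} = (b_1, b_2)$. If $P$ is the parallelogram which has $\mathbf{a}$ and $\mathbf{b}$ for
adjacent sides and $A$ is the area of $P$ (see Figure 1.6.1), then we saw in Section 1.3 that

\[ A = \|(a_1, a_2, 0) \times (b_1, b_2, 0)\| = \|(0, 0, a_1b_2 - a_2b_1)\| = |a_1b_2 - a_2b_1|. \tag{1.6.17} \]

This motivates the following definition.

Definition Given a $2 \times 2$ matrix

\[ M = \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix}, \]

the determinant of $M$, denoted $\det(M)$, is

\[ \det(M) = a_1b_2 - a_2b_1. \tag{1.6.18} \]

Hence we have $A = |\det(M)|$. In words, for a $2 \times 2$ matrix $M$, the absolute value of
the determinant of $M$ equals the area of the parallelogram which has the rows of $M$ for
adjacent sides.

Example We have

\[ \det\begin{bmatrix} 1 & 3 \\ -4 & 5 \end{bmatrix} = (1)(5) - (3)(-4) = 5 + 12 = 17. \]

Now consider a $3 \times 3$ matrix

\[ M = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \]

and let $\mathbf{a} = (a_1, a_2, a_3)$, $\mathbf{b} = (b_1, b_2, b_3)$, and $\mathbf{c} = (c_1, c_2, c_3)$. If $V$ is the volume of the
parallelepiped $P$ with adjacent edges $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$, then, again from Section 1.3,

\begin{align*}
V &= |\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})| \\
&= |a_1(b_2c_3 - b_3c_2) + a_2(b_3c_1 - b_1c_3) + a_3(b_1c_2 - b_2c_1)| \\
&= \left| a_1 \det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix} - a_2 \det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix} + a_3 \det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix} \right|. \tag{1.6.19}
\end{align*}

Definition Given a $3 \times 3$ matrix

\[ M = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}, \]

the determinant of $M$, denoted $\det(M)$, is

\[ \det(M) = a_1 \det\begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix} - a_2 \det\begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix} + a_3 \det\begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix}. \tag{1.6.20} \]

Similar to the $2 \times 2$ case, we have $V = |\det(M)|$.

Example We have

\begin{align*}
\det\begin{bmatrix} 2 & 3 & 9 \\ 2 & 1 & -4 \\ 5 & 1 & -1 \end{bmatrix} &= 2 \det\begin{bmatrix} 1 & -4 \\ 1 & -1 \end{bmatrix} - 3 \det\begin{bmatrix} 2 & -4 \\ 5 & -1 \end{bmatrix} + 9 \det\begin{bmatrix} 2 & 1 \\ 5 & 1 \end{bmatrix} \\
&= 2(-1 + 4) - 3(-2 + 20) + 9(2 - 5) \\
&= 6 - 54 - 27 \\
&= -75.
\end{align*}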



Given an $n \times n$ matrix $M = [a_{ij}]$, let $M_{ij}$ be the $(n - 1) \times (n - 1)$ matrix obtained by
deleting the $i$th row and $j$th column of $M$. If for $n = 1$ we first define $\det(M) = a_{11}$ (that
is, the determinant of a $1 \times 1$ matrix is just the value of its single entry), then we could
express, for $n = 2$, the definition of the determinant of a $2 \times 2$ matrix given in (1.6.18)
in the form

\[ \det(M) = a_{11} \det(M_{11}) - a_{12} \det(M_{12}) = a_{11}a_{22} - a_{12}a_{21}. \tag{1.6.21} \]

Similarly, with $n = 3$, we could express the definition of the determinant of $M$ given in
(1.6.20) in the form

\[ \det(M) = a_{11} \det(M_{11}) - a_{12} \det(M_{12}) + a_{13} \det(M_{13}). \tag{1.6.22} \]

Following this pattern, we may form a recursive definition for the determinant of an $n \times n$
matrix.

Definition Suppose $M = [a_{ij}]$ is an $n \times n$ matrix and let $M_{ij}$ be the $(n - 1) \times (n - 1)$ matrix
obtained by deleting the $i$th row and $j$th column of $M$, $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$.
For $n = 1$, we define the determinant of $M$, denoted $\det(M)$, by

\[ \det(M) = a_{11}. \tag{1.6.23} \]

For $n > 1$, we define the determinant of $M$, denoted $\det(M)$, by

\begin{align*}
\det(M) &= a_{11} \det(M_{11}) - a_{12} \det(M_{12}) + \cdots + (-1)^{1+n} a_{1n} \det(M_{1n}) \\
&= \sum_{j=1}^{n} (-1)^{1+j} a_{1j} \det(M_{1j}). \tag{1.6.24}
\end{align*}

We call the definition recursive because we have defined the determinant of an $n \times n$
matrix in terms of the determinants of $(n - 1) \times (n - 1)$ matrices, which in turn are defined
in terms of the determinants of $(n - 2) \times (n - 2)$ matrices, and so on, until we have reduced
the problem to computing the determinants of $1 \times 1$ matrices.
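
The recursive definition maps naturally onto a recursive function. The sketch below is a direct transcription of (1.6.23) and (1.6.24) for matrices stored as lists of rows; the names `minor` and `det` are assumptions for illustration. Indices in the code start at 0, so the sign $(-1)^{1+j}$ becomes `(-1) ** j`.

```python
def minor(M, i, j):
    """Matrix obtained by deleting row i and column j of M (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]


def det(M):
    """Determinant by expansion along the first row, following (1.6.24)."""
    n = len(M)
    if n == 1:
        return M[0][0]                      # base case (1.6.23)
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(n))


print(det([[1, 3], [-4, 5]]))
print(det([[2, 3, 9], [2, 1, -4], [5, 1, -1]]))
```

This is meant to mirror the definition rather than to be efficient: the number of recursive calls grows roughly like $n!$, so practical computations on large matrices use other methods.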

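Example For an example of the determinant of a $4 \times 4$ matrix, we have

\begin{align*}
\det\begin{bmatrix} 2 & 1 & 3 & 2 \\ 2 & 1 & 4 & 1 \\ -2 & 3 & -1 & 2 \\ 1 & 2 & 1 & 1 \end{bmatrix}
&= 2 \det\begin{bmatrix} 1 & 4 & 1 \\ 3 & -1 & 2 \\ 2 & 1 & 1 \end{bmatrix} - \det\begin{bmatrix} 2 & 4 & 1 \\ -2 & -1 & 2 \\ 1 & 1 & 1 \end{bmatrix} \\
&\qquad + 3 \det\begin{bmatrix} 2 & 1 & 1 \\ -2 & 3 & 2 \\ 1 & 2 & 1 \end{bmatrix} - 2 \det\begin{bmatrix} 2 & 1 & 4 \\ -2 & 3 & -1 \\ 1 & 2 & 1 \end{bmatrix} \\
&= 2((-1 - 2) - 4(3 - 4) + (3 + 2)) - (2(-1 - 2) - 4(-2 - 2) + (-2 + 1)) \\
&\qquad + 3(2(3 - 4) - (-2 - 2) + (-4 - 3)) - 2(2(3 + 2) - (-2 + 1) + 4(-4 - 3)) \\
&= 2(-3 + 4 + 5) - (-6 + 16 - 1) + 3(-2 + 4 - 7) - 2(10 + 1 - 28) \\
&= 12 - 9 - 15 + 34 \\
&= 22.
\end{align*}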



The next theorem states that there is nothing special about using the first row of the 
matrix in the expansion of the determinant specified in (1.6.24), nor is there anything 
special about expanding along a row instead of a column. The practical effect is that we 
may compute the determinant of a given matrix expanding along whichever row or column 
is most convenient. The proof of this theorem would take us too far afield at this point, 
so we will omit it (but you will be asked to verify the theorem for the special cases n = 2 
and n = 3 in Problem 10). 
Theorem Let $M = [a_{ij}]$ be an $n \times n$ matrix and let $M_{ij}$ be the $(n - 1) \times (n - 1)$ matrix
obtained by deleting the $i$th row and $j$th column of $M$. Then for any $i = 1, 2, \ldots, n$,

\[ \det(M) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det(M_{ij}), \tag{1.6.25} \]

and for any $j = 1, 2, \ldots, n$,

\[ \det(M) = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \det(M_{ij}). \tag{1.6.26} \]

Example The simplest way to compute the determinant of the matrix

\[ M = \begin{bmatrix} 4 & 0 & 3 \\ 2 & 3 & 1 \\ -3 & 0 & -2 \end{bmatrix} \]

is to expand along the second column. Namely,

\begin{align*}
\det(M) &= (-1)^{1+2}(0) \det\begin{bmatrix} 2 & 1 \\ -3 & -2 \end{bmatrix} + (-1)^{2+2}(3) \det\begin{bmatrix} 4 & 3 \\ -3 & -2 \end{bmatrix} + (-1)^{3+2}(0) \det\begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix} \\
&= 3(-8 + 9) \\
&= 3.
\end{align*}

You should verify that expanding along the first row, as we did in the definition of the
determinant, gives the same result.
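
The claim that every row and every column gives the same value can be checked for this particular matrix with a short sketch. The function names below are assumptions for illustration; indices are 0-based, which leaves the parity of $i + j$, and hence the sign $(-1)^{i+j}$, unchanged.

```python
def minor(M, i, j):
    """Matrix obtained by deleting row i and column j of M (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]


def det_along_row(M, i):
    """Expansion (1.6.25) along row i; sub-determinants use their first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** (i + j) * M[i][j] * det_along_row(minor(M, i, j), 0)
               for j in range(n))


def det_along_col(M, j):
    """Expansion (1.6.26) along column j; sub-determinants use their first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** (i + j) * M[i][j] * det_along_row(minor(M, i, j), 0)
               for i in range(n))


M = [[4, 0, 3], [2, 3, 1], [-3, 0, -2]]
by_rows = [det_along_row(M, i) for i in range(3)]
by_cols = [det_along_col(M, j) for j in range(3)]
print(by_rows, by_cols)
```

Expanding along the second column is the cheapest choice by hand because two of its three entries are zero, so only one $2 \times 2$ determinant has to be evaluated.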

In order to return to the problem of computing volumes, we need to define a parallelepiped
in $\mathbb{R}^n$. First note that if $P$ is a parallelogram in $\mathbb{R}^2$ with adjacent sides given by
the vectors $\mathbf{a}$ and $\mathbf{b}$, then

\[ P = \{\mathbf{y} : \mathbf{y} = t\mathbf{a} + s\mathbf{b},\ 0 \leq t \leq 1,\ 0 \leq s \leq 1\}. \tag{1.6.27} \]

That is, for $0 \leq t \leq 1$, $t\mathbf{a}$ is a point between $\mathbf{0}$ and $\mathbf{a}$, and for $0 \leq s \leq 1$, $s\mathbf{b}$ is a point
between $\mathbf{0}$ and $\mathbf{b}$; hence $t\mathbf{a} + s\mathbf{b}$ is a point in the parallelogram $P$. Moreover, every point
in $P$ may be expressed in this form. See Figure 1.6.2. The following definition generalizes
this characterization of parallelograms.

Definition Let $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ be linearly independent vectors in $\mathbb{R}^n$. We call

\[ P = \{\mathbf{y} : \mathbf{y} = t_1\mathbf{a}_1 + t_2\mathbf{a}_2 + \cdots + t_n\mathbf{a}_n,\ 0 \leq t_i \leq 1,\ i = 1, 2, \ldots, n\} \tag{1.6.28} \]

an $n$-dimensional parallelepiped with adjacent edges $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$.
Definition Let $P$ be an $n$-dimensional parallelepiped with adjacent edges $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$
and let $M$ be the $n \times n$ matrix which has $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ for its rows. Then the volume of
$P$ is defined to be $|\det(M)|$.

It may be shown, using (1.6.26) and induction, that if $N$ is the matrix obtained by
interchanging the rows and columns of an $n \times n$ matrix $M$, then $\det(N) = \det(M)$ (see
Problem 12). Thus we could have defined $M$ in the previous definition using $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$
for columns rather than rows.

Now suppose $L : \mathbb{R}^n \to \mathbb{R}^n$ is linear and let $M$ be the $n \times n$ matrix such that $L(\mathbf{x}) = M\mathbf{x}$
for all $\mathbf{x}$ in $\mathbb{R}^n$. Let $C$ be the $n$-dimensional parallelepiped with adjacent edges
$\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$, the standard basis vectors for $\mathbb{R}^n$. Then $C$ is a $1 \times 1$ square when $n = 2$
and a $1 \times 1 \times 1$ cube when $n = 3$. In general, we may think of $C$ as an $n$-dimensional unit
cube. Note that the volume of $C$ is, by definition,

\[ \left| \det\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \right| = 1. \]

Suppose $L(\mathbf{e}_1), L(\mathbf{e}_2), \ldots, L(\mathbf{e}_n)$ are linearly independent and let $P$ be the $n$-dimensional
parallelepiped with adjacent edges $L(\mathbf{e}_1), L(\mathbf{e}_2), \ldots, L(\mathbf{e}_n)$. Note that if

\[ \mathbf{x} = t_1\mathbf{e}_1 + t_2\mathbf{e}_2 + \cdots + t_n\mathbf{e}_n, \]

where $0 \leq t_k \leq 1$ for $k = 1, 2, \ldots, n$, is a point in $C$, then

\[ L(\mathbf{x}) = t_1L(\mathbf{e}_1) + t_2L(\mathbf{e}_2) + \cdots + t_nL(\mathbf{e}_n) \]

is a point in $P$. In fact, $L$ maps the $n$-dimensional unit cube $C$ exactly onto the
$n$-dimensional parallelepiped $P$. Since $L(\mathbf{e}_1), L(\mathbf{e}_2), \ldots, L(\mathbf{e}_n)$ are the columns of $M$, it
follows that the volume of $P$ equals $|\det(M)|$. In other words, $|\det(M)|$ measures how
much $L$ stretches or shrinks the volume of a unit cube.

Theorem Suppose $L : \mathbb{R}^n \to \mathbb{R}^n$ is linear and $M$ is the $n \times n$ matrix such that
$L(\mathbf{x}) = M\mathbf{x}$. If $L(\mathbf{e}_1), L(\mathbf{e}_2), \ldots, L(\mathbf{e}_n)$ are linearly independent and $P$ is the $n$-dimensional
parallelepiped with adjacent edges $L(\mathbf{e}_1), L(\mathbf{e}_2), \ldots, L(\mathbf{e}_n)$, then the volume of $P$ is equal
to $|\det(M)|$.
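
For $n = 2$ the theorem can be illustrated with an independent area computation. The sketch below maps the unit square through a matrix $M$, computes the area of the image parallelogram from its corner coordinates with the shoelace formula (a standard polygon-area formula that is not part of the text), and compares the result with $|\det(M)|$. The matrix is a hypothetical example chosen only for illustration.

```python
def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in order around the boundary."""
    total = 0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


M = [[2, 1], [1, 3]]                     # hypothetical 2 x 2 matrix

# The columns of M are L(e1) and L(e2), the adjacent edges of the image P.
c1 = (M[0][0], M[1][0])
c2 = (M[0][1], M[1][1])
corners = [(0, 0), c1, (c1[0] + c2[0], c1[1] + c2[1]), c2]

area = shoelace_area(corners)
det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(area, abs(det_M))
```

Both numbers describe the same quantity: the factor by which $L$ scales the area of the unit square.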



Problems 





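1. Let

\[ M = \begin{bmatrix} 2 & 3 \\ -2 & 1 \\ 4 & -1 \end{bmatrix} \quad \text{and} \quad N = \begin{bmatrix} 3 & -2 \\ 1 & 0 \\ 2 & -5 \end{bmatrix}. \]

Evaluate the following.

(a) $3M$
(b) $M - N$
(c) $2M + N$
(d) $2N - 6M$

2. Evaluate the following matrix products.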



3 2 
-1 1 



2 1 3 
-3 2 1 



(a) 

(c) l _ 
3. Suppose L : R 3 

and 



2 -3 
1 4 



4 -1 
2 4 
1 -2 



(b) 

(d) [1 2 3 -1 



1 4 

2 -2 



" 2 


1- 


3 


1 


-2 


4 


. 


-4. 



$\to \mathbb{R}^3$ and $K : \mathbb{R}^3 \to \mathbb{R}^3$ are defined by

\[ L(x, y, z) = (2x + 3y,\, y - x + 2z,\, x + 2y - z) \]

and

\[ K(x, y, z) = (2x + 4y - 3z,\, x + y + z,\, 3x - y + 4z). \]

Find the matrices for the following linear functions.

(a) $3L$
(b) $L + K$
(c) $2L - K$
(d) $K + 2L$
(e) $K \circ L$
(f) $L \circ K$

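4. Let $R_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear function which rotates a vector in $\mathbb{R}^2$ counterclockwise
through an angle $\theta$. In Section 1.5 we saw that

\[ R_\theta(x, y) = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \]

Show that the matrix for $R_\theta \circ R_\alpha$ is the same as the matrix for $R_{\theta + \alpha}$. In other words,
show that $R_\theta \circ R_\alpha = R_{\theta + \alpha}$.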
5. Compute the determinants of the following matrices.

(a) 

(c) 



2 
1 


3 
4 








(b) 


-3 
1 


-2 
2 








"2 




3 


1" 






"-1 


2 


-1" 






1 




2 


9 




(d) 


3 


1 









_5 




3 


-1 






5 


-4 


0_ 






-1 




2 


-1 


3" 
1 
3 
1. 




" 1 




2 
2 


-2 



3 
2 


1- 




4 
1 
.1 




3 
4 


-2 
-4 


(f) 


-3 
1 


2 
5 




-2 


1 
1 


5 





3 


3 




. 6 


-5 





2 


-4. 



(e) 



6. Find the area of the parallelogram in $\mathbb{R}^2$ with vertices at $(1, -2)$, $(3, -1)$, $(4, 1)$, and
$(2, 0)$.

7. Find the volume of the parallelepiped in $\mathbb{R}^3$ with bottom vertices at $(1, 1, 1)$, $(2, 3, 2)$,
$(-1, 4, 3)$, and $(-2, 2, 2)$ and top vertices at $(1, 0, 5)$, $(2, 2, 6)$, $(-1, 3, 7)$, and $(-2, 1, 6)$.

8. Let $P$ be the 4-dimensional parallelepiped with adjacent edges $\mathbf{a}_1 = (2, 1, 2, 1)$, $\mathbf{a}_2 = (-2, 0, 1, 1)$,
$\mathbf{a}_3 = (1, 1, 3, 6)$, and $\mathbf{a}_4 = (-3, 1, 5, 0)$. Find the volume of $P$.

9. Find $2 \times 2$ matrices $A$ and $B$ for which $AB \neq BA$.

10. Verify that (1.6.25) and (1.6.26) hold for all $2 \times 2$ and $3 \times 3$ matrices.

11. An $n \times n$ matrix $M = [a_{ij}]$ is called a diagonal matrix if $a_{ij} = 0$ for all $i \neq j$. Show
that if $M$ is a diagonal matrix, then $\det(M) = a_{11}a_{22} \cdots a_{nn}$.

12. If $M$ is an $n \times m$ matrix, then the $m \times n$ matrix $M^T$ whose columns are the rows of
$M$ is called the transpose of $M$. For example, if



M = 



then 



M 1 



3 
4 



(a) Show that for a $2 \times 2$ matrix $M$, $\det(M^T) = \det(M)$.

(b) Show that for a $3 \times 3$ matrix $M$, $\det(M^T) = \det(M)$. (Hint: Using (1.6.26), expand
$\det(M)$ along the first row and $\det(M^T)$ along the first column.)

(c) Use induction to show that for any $n \times n$ matrix $M$, $\det(M^T) = \det(M)$. (Hint:
Note that $(M^T)_{ij} = (M_{ji})^T$.)

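13. Let $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$ be vectors in $\mathbb{R}^3$ and let $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ be the
standard basis vectors for $\mathbb{R}^3$. Show that applying (1.6.20) to the array

\[ \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{bmatrix} \]

yields $\mathbf{x} \times \mathbf{y}$. Discuss what is correct and what is incorrect about the statement

\[ \mathbf{x} \times \mathbf{y} = \det\begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{bmatrix}. \]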



14. Show that the set of all points $\mathbf{x} = (x, y, z)$ in $\mathbb{R}^3$ which satisfy the equation

\[ \det\begin{bmatrix} x & y & z \\ 1 & 2 & -1 \\ 3 & 1 & 2 \end{bmatrix} = 0 \]

is a plane passing through the points $(0, 0, 0)$, $(1, 2, -1)$, and $(3, 1, 2)$.



15. Verify directly that if $L : \mathbb{R}^m \to \mathbb{R}^p$ and $K : \mathbb{R}^p \to \mathbb{R}^n$ are linear functions, then
$K \circ L : \mathbb{R}^m \to \mathbb{R}^n$ is also a linear function.





Section 2.1 
Curves 



Now that we have a basic understanding of the geometry of $\mathbb{R}^n$, we are in a position
to start the study of calculus of more than one variable. We will break our study into
three pieces. In this chapter we will consider functions $f : \mathbb{R} \to \mathbb{R}^n$, in Chapter 3 we will
study functions $f : \mathbb{R}^n \to \mathbb{R}$, and finally in Chapter 4 we will consider the general case of
functions $f : \mathbb{R}^m \to \mathbb{R}^n$.

Parametrizations of curves 

We begin with some terminology and notation. Given a function $f : \mathbb{R} \to \mathbb{R}^n$, let

$$f_k(t) = k\text{th coordinate of } f(t) \tag{2.1.1}$$

for $k = 1, 2, \ldots, n$. We call $f_k : \mathbb{R} \to \mathbb{R}$ the $k$th coordinate function of $f$. Note that $f_k$ has
the same domain as $f$ and that, for any point $t$ in the domain of $f$,

$$f(t) = (f_1(t), f_2(t), \ldots, f_n(t)). \tag{2.1.2}$$

If the domain of $f$ is an interval $I$, then the range of $f$, that is, the set

$$C = \{\mathbf{x} : \mathbf{x} = f(t) \text{ for some } t \text{ in } I\}, \tag{2.1.3}$$

is called a curve with parametrization $f$. The equation $\mathbf{x} = f(t)$, where $\mathbf{x}$ is in $\mathbb{R}^n$, is a
vector equation for $C$ and, writing $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, the equations

$$\begin{aligned} x_1 &= f_1(t), \\ x_2 &= f_2(t), \\ &\;\;\vdots \\ x_n &= f_n(t), \end{aligned} \tag{2.1.4}$$

are parametric equations for $C$.

Example Consider $f : \mathbb{R} \to \mathbb{R}^2$ defined by

$$f(t) = (\cos(t), \sin(t))$$

for $0 \le t \le 2\pi$. Then for every value of $t$, $f(t)$ is a point on the circle $C$ of radius 1 with
center at $(0, 0)$. Note that $f(0) = (1, 0)$, $f\left(\frac{\pi}{2}\right) = (0, 1)$, $f(\pi) = (-1, 0)$, $f\left(\frac{3\pi}{2}\right) = (0, -1)$,
and $f(2\pi) = (1, 0) = f(0)$. In fact, as $t$ goes from $0$ to $2\pi$, $f(t)$ traverses $C$ exactly once
in the counterclockwise direction. Thus $f$ is a parametrization of the unit circle $C$. If we
denote a point in $\mathbb{R}^2$ by $(x, y)$, then

$$\begin{aligned} x &= \cos(t), \\ y &= \sin(t), \end{aligned}$$

are parametric equations for $C$. See Figure 2.1.1. The coordinate functions are

$$\begin{aligned} f_1(t) &= \cos(t), \\ f_2(t) &= \sin(t), \end{aligned}$$

although we frequently write these as simply

$$\begin{aligned} x(t) &= \cos(t), \\ y(t) &= \sin(t). \end{aligned}$$

Figure 2.1.1 $f(t) = (\cos(t), \sin(t))$

Copyright © by Dan Sloughter 2001
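The circle example can be spot-checked numerically. The following is a minimal Python sketch, not part of the original text; the helper name `on_unit_circle`, the tolerance, and the choice of nine sample points are illustrative assumptions.

```python
import math

def f(t):
    """Parametrization of the unit circle: f(t) = (cos t, sin t)."""
    return (math.cos(t), math.sin(t))

def on_unit_circle(point, tol=1e-12):
    """True if x^2 + y^2 equals 1 up to the (assumed) tolerance tol."""
    x, y = point
    return abs(x * x + y * y - 1.0) < tol

# Sample f at nine evenly spaced parameter values in [0, 2*pi].
samples = [f(2 * math.pi * k / 8) for k in range(9)]
all_on_circle = all(on_unit_circle(p) for p in samples)
```

Every sampled point satisfies $x^2 + y^2 = 1$ up to rounding, and the first and last samples agree up to rounding because $f(2\pi) = f(0)$.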

Example Consider $g : \mathbb{R} \to \mathbb{R}^2$ defined by

$$g(t) = (\sin(2\pi t), \cos(2\pi t))$$

for $0 \le t \le 2$. Then $g$ also parametrizes the unit circle $C$ centered at the origin, the same
as $f$ in the previous example. However, there is a difference: $g(0) = (0, 1)$, $g\left(\frac{1}{4}\right) = (1, 0)$,
$g\left(\frac{1}{2}\right) = (0, -1)$, $g\left(\frac{3}{4}\right) = (-1, 0)$, and $g(1) = (0, 1) = g(0)$, at which point $g$ starts to repeat
its values. Hence $g(t)$, starting at $(0, 1)$, traverses $C$ twice in the clockwise direction as $t$
goes from $0$ to $2$.

Figure 2.1.2 The ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$

Example More generally, suppose $a$, $b$, and $\alpha$ are real numbers, with $a > 0$, $b > 0$, and
$\alpha \neq 0$, and let

$$\begin{aligned} x(t) &= a\cos(\alpha t), \\ y(t) &= b\sin(\alpha t). \end{aligned}$$

Then

$$\frac{(x(t))^2}{a^2} + \frac{(y(t))^2}{b^2} = \cos^2(\alpha t) + \sin^2(\alpha t) = 1,$$

so $(x(t), y(t))$ is a point on the ellipse $E$ with equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$

shown in Figure 2.1.2. Thus the function

$$f(t) = (a\cos(\alpha t), b\sin(\alpha t))$$

parametrizes the ellipse $E$, traversing the complete ellipse as $t$ goes from $0$ to $\frac{2\pi}{|\alpha|}$.
Example Define $f : \mathbb{R} \to \mathbb{R}^2$ by

$$f(t) = (t\cos(t), t\sin(t))$$

for $-\infty < t < \infty$. Then for negative values of $t$, $f(t)$ spirals into the origin as $t$ increases,
while for positive values of $t$, $f(t)$ spirals away from the origin. Part of this curve
parametrized by $f$ is shown in Figure 2.1.3.

Figure 2.1.3 The spiral $f(t) = (t\cos(t), t\sin(t))$ for $-4\pi \le t \le 4\pi$



Example Define $f : \mathbb{R} \to \mathbb{R}^2$ by

$$f(t) = (3 - 4t, 2 + 3t)$$

for $-\infty < t < \infty$. Then

$$f(t) = t(-4, 3) + (3, 2),$$

so $f$ is a parametrization of the line through the point $(3, 2)$ in the direction of $(-4, 3)$.

In general, a function $f : \mathbb{R} \to \mathbb{R}^n$ defined by $f(t) = t\mathbf{v} + \mathbf{p}$, where $\mathbf{v} \neq \mathbf{0}$ and $\mathbf{p}$ are
vectors in $\mathbb{R}^n$, parametrizes a line in $\mathbb{R}^n$.

Example Suppose $g : \mathbb{R} \to \mathbb{R}^3$ is defined by

$$g(t) = (4\cos(t), 4\sin(t), t)$$

for $-\infty < t < \infty$. If we denote the coordinate functions by

$$\begin{aligned} x(t) &= 4\cos(t), \\ y(t) &= 4\sin(t), \\ z(t) &= t, \end{aligned}$$

then

$$(x(t))^2 + (y(t))^2 = 16\cos^2(t) + 16\sin^2(t) = 16.$$

Hence $g(t)$ always lies on a cylinder of radius 4 centered about the $z$-axis. As $t$ increases,
$g(t)$ rises steadily as it winds around this cylinder, completing one trip around the cylinder
over every interval of length $2\pi$. In other words, $g$ parametrizes a helix, part of which is
shown in Figure 2.1.4.

Figure 2.1.4 The helix $f(t) = (4\cos(t), 4\sin(t), t)$, $-2\pi \le t \le 2\pi$
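The two claims about the helix, that it stays on a cylinder of radius 4 and that it rises by $2\pi$ for each trip around, can be checked with a short Python sketch. The helper name `distance_from_z_axis` and the sample parameter values are illustrative assumptions, not part of the text.

```python
import math

def g(t):
    """The helix g(t) = (4 cos t, 4 sin t, t)."""
    return (4 * math.cos(t), 4 * math.sin(t), t)

def distance_from_z_axis(p):
    """Distance from the point p = (x, y, z) to the z-axis."""
    x, y, _ = p
    return math.hypot(x, y)

# Distances to the z-axis at parameter values between -5 and 5.
radii = [distance_from_z_axis(g(0.1 * k)) for k in range(-50, 51)]

# One full turn later, x and y repeat while z has increased by 2*pi.
x0, y0, z0 = g(1.0)
x1, y1, z1 = g(1.0 + 2 * math.pi)
```

All entries of `radii` equal 4 up to rounding, and `z1 - z0` equals $2\pi$ up to rounding while `(x1, y1)` matches `(x0, y0)`.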

Limits in $\mathbb{R}^n$

As was the case in one-variable calculus, limits are fundamental for understanding ideas
such as continuity and differentiability. We begin with the definition of the limit of a
sequence of points in $\mathbb{R}^m$.

Definition Let $\{\mathbf{x}_n\}$ be a sequence of points in $\mathbb{R}^m$. We say that the limit of $\{\mathbf{x}_n\}$ as $n$
approaches infinity is $\mathbf{a}$, written $\lim_{n \to \infty} \mathbf{x}_n = \mathbf{a}$, if for every $\epsilon > 0$ there is a positive integer
$N$ such that

$$\|\mathbf{x}_n - \mathbf{a}\| < \epsilon \tag{2.1.5}$$

whenever $n > N$.

Notice that this definition involves only a slight modification of the definition for the
limit of a sequence of real numbers, namely, the use of the norm of a vector instead of the
absolute value of a real number in (2.1.5). In words, $\lim_{n \to \infty} \mathbf{x}_n = \mathbf{a}$ if, given any $\epsilon > 0$, we
can always find a point in the sequence beyond which all terms of the sequence lie within
$B^m(\mathbf{a}, \epsilon)$, the open ball of radius $\epsilon$ centered at $\mathbf{a}$.

Figure 2.1.5 Points $\left(1 - \frac{1}{n}, \frac{2}{n}\right)$ approaching $(1, 0)$

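Example Suppose

$$\mathbf{x}_n = \left(1 - \frac{1}{n}, \frac{2}{n}\right)$$

for $n = 1, 2, 3, \ldots$. Since

$$\lim_{n \to \infty} \left(1 - \frac{1}{n}\right) = 1$$

and

$$\lim_{n \to \infty} \frac{2}{n} = 0,$$

we should have

$$\lim_{n \to \infty} \mathbf{x}_n = (1, 0).$$

To verify this, we first note that

$$\|\mathbf{x}_n - (1, 0)\| = \left\|\left(-\frac{1}{n}, \frac{2}{n}\right)\right\| = \sqrt{\frac{1}{n^2} + \frac{4}{n^2}} = \frac{\sqrt{5}}{n}.$$

Hence $\|\mathbf{x}_n - (1, 0)\| < \epsilon$ whenever $n > \frac{\sqrt{5}}{\epsilon}$. That is, if we let $N$ be any integer greater than
or equal to $\frac{\sqrt{5}}{\epsilon}$, then $\|\mathbf{x}_n - (1, 0)\| < \epsilon$ whenever $n > N$, verifying that

$$\lim_{n \to \infty} \mathbf{x}_n = (1, 0).$$

See Figure 2.1.5.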
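The choice of $N$ in this example can be tried out for a concrete $\epsilon$. The sketch below is illustrative only: the function names `dist_to_limit` and `choose_N`, the value $\epsilon = 10^{-3}$, and the range of indices tested are assumptions made for the demonstration.

```python
import math

def x(n):
    """The n-th term of the sequence, x_n = (1 - 1/n, 2/n)."""
    return (1 - 1 / n, 2 / n)

def dist_to_limit(n):
    """||x_n - (1, 0)||, which equals sqrt(5)/n."""
    xn = x(n)
    return math.hypot(xn[0] - 1.0, xn[1] - 0.0)

def choose_N(eps):
    """Smallest integer N with N >= sqrt(5)/eps, as in the example."""
    return math.ceil(math.sqrt(5) / eps)

eps = 1e-3
N = choose_N(eps)
# Check the next 1000 terms after N; a finite sample, not a proof.
ok = all(dist_to_limit(n) < eps for n in range(N + 1, N + 1001))
```

Here `ok` is true for every index tested beyond `N`, while the term with index `N - 1` still lies outside the ball of radius `eps`.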



Put another way, the definition of the limit of a sequence in $\mathbb{R}^m$ says that a sequence
$\{\mathbf{x}_n\}$ in $\mathbb{R}^m$ converges to $\mathbf{a}$ in $\mathbb{R}^m$ if and only if the sequence of real numbers $\{\|\mathbf{x}_n - \mathbf{a}\|\}$
converges to 0. That is, $\lim_{n \to \infty} \mathbf{x}_n = \mathbf{a}$ if and only if $\lim_{n \to \infty} \|\mathbf{x}_n - \mathbf{a}\| = 0$. Moreover, if we let
$\mathbf{x}_n = (x_{n1}, x_{n2}, \ldots, x_{nm})$ and $\mathbf{a} = (a_1, a_2, \ldots, a_m)$, then

$$\|\mathbf{x}_n - \mathbf{a}\| = \sqrt{(x_{n1} - a_1)^2 + (x_{n2} - a_2)^2 + \cdots + (x_{nm} - a_m)^2}, \tag{2.1.6}$$

so $\lim_{n \to \infty} \|\mathbf{x}_n - \mathbf{a}\| = 0$ if and only if

$$\lim_{n \to \infty} \sqrt{(x_{n1} - a_1)^2 + (x_{n2} - a_2)^2 + \cdots + (x_{nm} - a_m)^2} = 0. \tag{2.1.7}$$

But (2.1.7) can occur only when $\lim_{n \to \infty} (x_{nk} - a_k)^2 = 0$ for $k = 1, 2, \ldots, m$. Hence $\lim_{n \to \infty} \mathbf{x}_n = \mathbf{a}$
if and only if $\lim_{n \to \infty} x_{nk} = a_k$ for $k = 1, 2, \ldots, m$.

Proposition Suppose $\{\mathbf{x}_n\}$ is a sequence in $\mathbb{R}^m$, $\mathbf{x}_n = (x_{n1}, x_{n2}, \ldots, x_{nm})$, and $\mathbf{a} =
(a_1, a_2, \ldots, a_m)$. Then $\lim_{n \to \infty} \mathbf{x}_n = \mathbf{a}$ if and only if $\lim_{n \to \infty} x_{nk} = a_k$ for $k = 1, 2, \ldots, m$.

This proposition tells us that to compute the limit of a sequence in $\mathbb{R}^m$, we need only
compute the limit of each coordinate separately, thus reducing the problem of computing
limits in $\mathbb{R}^m$ to the problem of finding limits of sequences of real numbers.

Example If

$$\mathbf{x}_n = \left(\frac{2 - n}{n^2}, \sin\left(\frac{1}{n}\right), \cos\left(\frac{3}{n}\right)\right),$$

$n = 1, 2, 3, \ldots$, then

$$\lim_{n \to \infty} \mathbf{x}_n = \left(\lim_{n \to \infty} \frac{2 - n}{n^2}, \lim_{n \to \infty} \sin\left(\frac{1}{n}\right), \lim_{n \to \infty} \cos\left(\frac{3}{n}\right)\right) = (0, 0, 1).$$
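The coordinate-by-coordinate limit above can be illustrated numerically. This Python sketch evaluates the distance from $\mathbf{x}_n$ to the claimed limit $(0, 0, 1)$ for a few values of $n$; the name `dist` and the sample indices are illustrative assumptions, and a numerical check of this kind suggests, but does not prove, convergence.

```python
import math

def x(n):
    """x_n = ((2 - n)/n^2, sin(1/n), cos(3/n))."""
    return ((2 - n) / n**2, math.sin(1 / n), math.cos(3 / n))

limit = (0.0, 0.0, 1.0)

def dist(n):
    """Euclidean distance ||x_n - (0, 0, 1)||."""
    return math.dist(x(n), limit)

# Distances for increasing n; each coordinate error shrinks separately.
distances = [dist(10 ** k) for k in range(1, 7)]
```

The entries of `distances` decrease toward 0, consistent with each coordinate converging on its own.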

We may now define the limit of a function $f : \mathbb{R} \to \mathbb{R}^m$ at a real number $c$. Notice that
the definition is identical to the definition of a limit for a real-valued function $f : \mathbb{R} \to \mathbb{R}$.

Definition Let $c$ be a real number, let $I$ be an open interval containing $c$, and let
$J = \{t : t \text{ is in } I, t \neq c\}$. Suppose $f : \mathbb{R} \to \mathbb{R}^m$ is defined for all $t$ in $J$. Then we say that
the limit of $f(t)$ as $t$ approaches $c$ is $\mathbf{a}$, denoted $\lim_{t \to c} f(t) = \mathbf{a}$, if for every sequence of real
numbers $\{t_n\}$ in $J$,

$$\lim_{n \to \infty} f(t_n) = \mathbf{a} \tag{2.1.8}$$

whenever $\lim_{n \to \infty} t_n = c$.

As in one-variable calculus, we may define the limit of $f(t)$ as $t$ approaches $c$ from the
right, denoted

$$\lim_{t \to c^+} f(t),$$

by restricting to sequences $\{t_n\}$ with $t_n > c$ for $n = 1, 2, 3, \ldots$, and the limit of $f(t)$ as $t$
approaches $c$ from the left, denoted

$$\lim_{t \to c^-} f(t),$$

by restricting to sequences $\{t_n\}$ with $t_n < c$ for $n = 1, 2, 3, \ldots$. Moreover, the following
useful proposition follows immediately from our definition and the previous proposition.

Proposition Suppose $f : \mathbb{R} \to \mathbb{R}^m$ with

$$f(t) = (f_1(t), f_2(t), \ldots, f_m(t)).$$

Then for any real number $c$,

$$\lim_{t \to c} f(t) = \left(\lim_{t \to c} f_1(t), \lim_{t \to c} f_2(t), \ldots, \lim_{t \to c} f_m(t)\right). \tag{2.1.9}$$

Hence the problem of computing limits for functions $f : \mathbb{R} \to \mathbb{R}^m$ reduces to the
problem of computing limits of the coordinate functions $f_k : \mathbb{R} \to \mathbb{R}$, $k = 1, 2, \ldots, m$, a
familiar problem from one-variable calculus. The analogous statements for limits from the
right and left also hold.

Example If $f(t) = (t^2 - 1, \sin(t), \cos(t))$ is a function from $\mathbb{R}$ to $\mathbb{R}^3$, then, for example,

$$\lim_{t \to \pi} f(t) = \left(\lim_{t \to \pi} (t^2 - 1), \lim_{t \to \pi} \sin(t), \lim_{t \to \pi} \cos(t)\right) = (\pi^2 - 1, 0, -1).$$

Definitions for continuity also follow the pattern of the related definitions in one-variable
calculus.

Definition Suppose $f : \mathbb{R} \to \mathbb{R}^m$. We say $f$ is continuous at a point $c$ if

$$\lim_{t \to c} f(t) = f(c). \tag{2.1.10}$$

We say $f$ is continuous from the right at $c$ if

$$\lim_{t \to c^+} f(t) = f(c) \tag{2.1.11}$$

and continuous from the left at $c$ if

$$\lim_{t \to c^-} f(t) = f(c). \tag{2.1.12}$$

We say $f$ is continuous on an open interval $(a, b)$ if $f$ is continuous at every point $c$ in
$(a, b)$ and we say $f$ is continuous on a closed interval $[a, b]$ if $f$ is continuous on the open
interval $(a, b)$, continuous from the right at $a$, and continuous from the left at $b$.

If $f(t) = (f_1(t), f_2(t), \ldots, f_m(t))$, then $f$ is continuous at a point $c$ if and only if

$$\lim_{t \to c} f(t) = \left(\lim_{t \to c} f_1(t), \lim_{t \to c} f_2(t), \ldots, \lim_{t \to c} f_m(t)\right) = f(c) = (f_1(c), f_2(c), \ldots, f_m(c)),$$

which is true if and only if $\lim_{t \to c} f_k(t) = f_k(c)$ for $k = 1, 2, \ldots, m$. In other words, we have
the following useful proposition.

Proposition A function $f : \mathbb{R} \to \mathbb{R}^m$ with $f(t) = (f_1(t), f_2(t), \ldots, f_m(t))$ is continuous
at a point $c$ if and only if the coordinate functions $f_1, f_2, \ldots, f_m$ are each continuous at $c$.

Similar statements hold for continuity from the right and from the left.

Example The function $f : \mathbb{R} \to \mathbb{R}^3$ defined by

$$f(t) = (\sin(t^2), t^3 + 4, \cos(t))$$

is continuous on the interval $(-\infty, \infty)$ since each of its coordinate functions is continuous
on $(-\infty, \infty)$.

Problems

1. Plot the curves parametrized by the following functions over the specified intervals $I$.

(a) $f(t) = (3t + 1, 2t - 1)$, $I = [-5, 5]$
(b) $g(t) = (t, t^2)$, $I = [-3, 3]$
(c) $f(t) = (3\cos(t), 3\sin(t))$, $I = [0, 2\pi]$
(d) $h(t) = (3\cos(t), 3\sin(t))$, $I = [0, \pi]$
(e) $f(t) = (4\cos(2t), 2\sin(2t))$, $I = [0, \pi]$
(f) $g(t) = (-4\cos(t), 2\sin(t))$, $I = [0, \pi]$
(g) $h(t) = (t\sin(3t), t\cos(3t))$, $I = [-\pi, \pi]$

2. Plot the curves parametrized by the following functions over the specified intervals $I$.

(a) $f(t) = (t + 1, 2t - 1, 3t)$, $I = [-4, 4]$
(b) $g(t) = (\cos(t), t, \sin(t))$, $I = [0, 4\pi]$
(c) $f(t) = (t\cos(2t), t\sin(2t), t)$, $I = [-10, 10]$
(d) $h(t) = (\cos(2t), \sin(2t), \sqrt{t})$, $I = [0, 9]$

3. Plot the curves parametrized by the following functions over the specified intervals $I$.

(a) $f(t) = (\cos(4\pi t), \sin(5\pi t))$, $I = [-0.5, 0.5]$
(b) $f(t) = (\cos(6\pi t), \sin(7\pi t))$, $I = [-0.5, 0.5]$
(c) $h(t) = (\cos^3(t), \sin^3(t))$, $I = [0, 2\pi]$
(d) $g(t) = (\cos(2\pi t), \sin(2\pi t), \sin(4\pi t))$, $I = [0, 1]$
(e) $f(t) = (\sin(4t)\cos(t), \sin(4t)\sin(t))$, $I = [0, 2\pi]$
(f) $h(t) = ((1 + 2\cos(t))\cos(t), (1 + 2\cos(t))\sin(t))$, $I = [0, 2\pi]$

4. Suppose $g : \mathbb{R} \to \mathbb{R}$ and we define $f : \mathbb{R} \to \mathbb{R}^2$ by $f(t) = (t, g(t))$. Describe the curve
parametrized by $f$.

5. For each of the following, compute $\lim_{n \to \infty} \mathbf{x}_n$.
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{In -I 3n + 4 ( 6 6n + 1 

(c) x n = — , 4 



n 2 + 1 ' n + 1 n 2 ' 2n 2 + 5 
6. Let / : R — >■ E 3 be defined by 



( Sin ( (/1 .n.s(/).:^ 



Evaluate the following. 

(a) Km /(f) (b) Jim /(f) 

(c) Km /(f) 

7. Discuss the continuity of each of the following functions. 

(a) /(f) = (f 2 + l,cos(2f),sin(3f) (b) g(t) = (v / H r T, tan(f)) 

(c) /(f) = x/l-t 2 , ^ (d) </(*) = (cos(4f), 1 - v / 37+T,sin(5f),sec(f)) 

8. Let $f : \mathbb{R} \to \mathbb{R}^3$ be defined by $f(t) = (t^2, 3t, 2t + 1)$. Find

$$\lim_{h \to 0} \frac{f(t + h) - f(t)}{h}.$$



Section 2.2
Best Affine Approximations



In this section we will generalize the basic ideas of the differential calculus of functions
$f : \mathbb{R} \to \mathbb{R}$ to functions $f : \mathbb{R} \to \mathbb{R}^n$. Recall that given a function $f : \mathbb{R} \to \mathbb{R}$, we
say $f$ is differentiable at a point $c$ if there exists an affine function $A : \mathbb{R} \to \mathbb{R}$, $A(x) =
m(x - c) + f(c)$, such that

$$\lim_{h \to 0} \frac{f(c + h) - A(c + h)}{h} = 0. \tag{2.2.1}$$

We call $A$ the best affine approximation to $f$ at $c$ and $m$ the derivative of $f$ at $c$, denoted
$f'(c)$. Moreover, we call the graph of $A$, that is, the line with equation

$$y = f'(c)(x - c) + f(c), \tag{2.2.2}$$

the tangent line to the graph of $f$ at $(c, f(c))$.

The condition (2.2.1) says that the function $\varphi(h) = f(c + h) - A(c + h)$ is $o(h)$. In
general, we say a function $\varphi : \mathbb{R} \to \mathbb{R}$ is $o(h)$ if

$$\lim_{h \to 0} \frac{\varphi(h)}{h} = 0. \tag{2.2.3}$$

Best affine approximations

Generalizing the idea of the best affine approximation to the case of a function $f : \mathbb{R} \to \mathbb{R}^n$
requires only a slight modification of the requirement that $f(c + h) - A(c + h)$ be $o(h)$.
Namely, since $f(c + h) - A(c + h)$ is a vector in $\mathbb{R}^n$, we will require that $\|f(c + h) - A(c + h)\|$,
instead of $f(c + h) - A(c + h)$, be $o(h)$. If $n = 1$, this will reduce to the one-variable definition
since, in that case, $\|f(c + h) - A(c + h)\| = |f(c + h) - A(c + h)|$ and a function $\varphi : \mathbb{R} \to \mathbb{R}$
is $o(h)$ if and only if $|\varphi(h)|$ is $o(h)$.

Definition Suppose $f : \mathbb{R} \to \mathbb{R}^n$ and $c$ is a point in the domain of $f$. We call an affine
function $A : \mathbb{R} \to \mathbb{R}^n$ the best affine approximation to $f$ at $c$ if (1) $A(c) = f(c)$ and (2)
$\|R(h)\|$ is $o(h)$, where

$$R(h) = f(c + h) - A(c + h). \tag{2.2.4}$$

Suppose $f : \mathbb{R} \to \mathbb{R}^n$ and $A : \mathbb{R} \to \mathbb{R}^n$ is an affine function for which $A(c) = f(c)$.
Since $A$ is affine, there exists a linear function $L : \mathbb{R} \to \mathbb{R}^n$ and a vector $\mathbf{b}$ in $\mathbb{R}^n$ such that
$A(t) = L(t) + \mathbf{b}$ for all $t$ in $\mathbb{R}$. Since we have

$$f(c) = A(c) = L(c) + \mathbf{b}, \tag{2.2.5}$$

it follows that $\mathbf{b} = f(c) - L(c)$. Hence, for all $t$ in $\mathbb{R}$,

$$A(t) = L(t) + f(c) - L(c) = L(t - c) + f(c).$$

Moreover, if $\mathbf{a} = L(1)$, then, from our results in Section 1.5,

$$A(t) = \mathbf{a}(t - c) + f(c).$$

Hence

$$R(h) = f(c + h) - A(c + h) = f(c + h) - f(c) - \mathbf{a}h,$$

from which it follows that

$$\begin{aligned} \lim_{h \to 0^+} \frac{\|R(h)\|}{h} &= \lim_{h \to 0^+} \frac{\|f(c + h) - f(c) - \mathbf{a}h\|}{h} \\ &= \lim_{h \to 0^+} \left\|\frac{f(c + h) - f(c) - \mathbf{a}h}{h}\right\| \\ &= \lim_{h \to 0^+} \left\|\frac{f(c + h) - f(c)}{h} - \mathbf{a}\right\|. \end{aligned}$$

Thus

$$\lim_{h \to 0^+} \frac{\|R(h)\|}{h} = 0$$

if and only if

$$\lim_{h \to 0^+} \frac{f(c + h) - f(c)}{h} = \mathbf{a}.$$

A similar calculation from the left shows that

$$\lim_{h \to 0^-} \frac{\|R(h)\|}{h} = 0$$

if and only if

$$\lim_{h \to 0^-} \frac{f(c + h) - f(c)}{h} = \mathbf{a}.$$

Hence

$$\lim_{h \to 0} \frac{\|R(h)\|}{h} = 0$$

if and only if

$$\lim_{h \to 0} \frac{f(c + h) - f(c)}{h} = \mathbf{a}.$$

That is, $A$ is the best affine approximation to $f$ at $c$ if and only if, for all $t$ in $\mathbb{R}$,

$$A(t) = \mathbf{a}(t - c) + f(c),$$

where

$$\mathbf{a} = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}. \tag{2.2.13}$$



Definition Suppose $f : \mathbb{R} \to \mathbb{R}^n$. If

$$\lim_{h \to 0} \frac{f(c + h) - f(c)}{h} \tag{2.2.14}$$

exists, then we say $f$ is differentiable at $c$ and we call

$$Df(c) = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h} \tag{2.2.15}$$

the derivative of $f$ at $c$.

Note that (2.2.15) is the same as the formula for the derivative in one-variable calculus.
In fact, in the case $n = 1$, (2.2.15) is just the derivative from one-variable calculus.
However, if $n > 1$, then $Df(c)$ will be a vector, not a scalar.

The following theorem summarizes our work above.

Theorem Suppose $f : \mathbb{R} \to \mathbb{R}^n$ and $c$ is a point in the domain of $f$. Then $f$ has a best
affine approximation $A : \mathbb{R} \to \mathbb{R}^n$ at $c$ if and only if $f$ is differentiable at $c$, in which case

$$A(t) = Df(c)(t - c) + f(c). \tag{2.2.16}$$

We saw in Section 2.1 that a limit of a vector-valued function $f$ may be computed by
evaluating the limit of each coordinate function separately. This result has an important
consequence for computing derivatives. Suppose $f : \mathbb{R} \to \mathbb{R}^n$ is differentiable at $c$. If we
write

$$f(t) = (f_1(t), f_2(t), \ldots, f_n(t)),$$

then

$$\begin{aligned} Df(c) &= \lim_{h \to 0} \frac{f(c + h) - f(c)}{h} \\ &= \lim_{h \to 0} \frac{1}{h}\big((f_1(c + h), f_2(c + h), \ldots, f_n(c + h)) - (f_1(c), f_2(c), \ldots, f_n(c))\big) \\ &= \lim_{h \to 0} \left(\frac{f_1(c + h) - f_1(c)}{h}, \frac{f_2(c + h) - f_2(c)}{h}, \ldots, \frac{f_n(c + h) - f_n(c)}{h}\right) \\ &= \left(\lim_{h \to 0} \frac{f_1(c + h) - f_1(c)}{h}, \lim_{h \to 0} \frac{f_2(c + h) - f_2(c)}{h}, \ldots, \lim_{h \to 0} \frac{f_n(c + h) - f_n(c)}{h}\right) \\ &= (f_1'(c), f_2'(c), \ldots, f_n'(c)). \end{aligned}$$

In words, the derivative of $f$ is the vector whose coordinates are the derivatives of the
coordinate functions of $f$, reducing the problem of differentiating vector-valued functions
to the problem of differentiation in single-variable calculus.

Proposition If $f$ is differentiable at $c$ and $f(t) = (f_1(t), f_2(t), \ldots, f_n(t))$, then each
coordinate function $f_k$, $k = 1, 2, \ldots, n$, is differentiable at $c$ and

$$Df(c) = (f_1'(c), f_2'(c), \ldots, f_n'(c)). \tag{2.2.17}$$

For an arbitrary point $t$ at which $f$ is differentiable, we will write

$$Df(t) = \lim_{h \to 0} \frac{f(t + h) - f(t)}{h} = (f_1'(t), f_2'(t), \ldots, f_n'(t)). \tag{2.2.18}$$

That is, we may think of $Df$ as a vector-valued function itself, with domain being the set
of points at which $f$ is differentiable.

Now suppose $f : \mathbb{R} \to \mathbb{R}^n$ parametrizes a curve $C$ and is differentiable at $c$. If $Df(c) \neq
\mathbf{0}$, then the best affine approximation

$$A(t) = Df(c)(t - c) + f(c)$$

parametrizes a line, a line which best approximates the curve $C$ for points near $f(c)$. On
the other hand, if $Df(c) = \mathbf{0}$, then $A$ is a constant function with range consisting of the
single point $f(c)$. These considerations motivate, in part, the following definitions.

Definition Suppose $f : \mathbb{R} \to \mathbb{R}^n$ is differentiable on $(a, b)$ and $\mathbf{x} = f(t)$ is a parametrization
of a curve $C$ for $a < t < b$. If $Df(t)$ is continuous and $Df(t) \neq \mathbf{0}$ for all $t$ in $(a, b)$,
then we call $f$ a smooth parametrization of $C$.

Definition Suppose $f : \mathbb{R} \to \mathbb{R}^n$ parametrizes a curve $C$ in $\mathbb{R}^n$ and let $A$ be the best
affine approximation to $f$ at $c$. If $f$ is smooth on some open interval containing $c$, then we
call the line in $\mathbb{R}^n$ parametrized by $A$ the tangent line to $C$ at $f(c)$.

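Example Define $f : \mathbb{R} \to \mathbb{R}^2$ by $f(t) = (\cos(t), \sin(t))$ for $-\infty < t < \infty$. Then, as we
saw in Section 2.1, $f$ parametrizes the unit circle $C$ centered at the origin. Now

$$Df(t) = (-\sin(t), \cos(t)),$$

so $Df(t)$ is continuous and $\|Df(t)\| = 1$ for all $t$. Thus $f$ is a smooth parametrization of
$C$. For example,

$$f\left(\frac{\pi}{6}\right) = \left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right)$$

and

$$Df\left(\frac{\pi}{6}\right) = \left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right),$$

so the best affine approximation to $f$ at $t = \frac{\pi}{6}$ is

$$A(t) = \left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right)\left(t - \frac{\pi}{6}\right) + \left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right).$$

Figure 2.2.1 shows $C$ along with the tangent line to $C$ at $t = \frac{\pi}{6}$.

Figure 2.2.1 Unit circle with tangent line at $\left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right)$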
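The defining property of the best affine approximation, that $\|R(h)\|$ is $o(h)$, can be observed numerically for the unit circle at $c = \pi/6$. The sketch below is only an illustration under stated assumptions: the helper names `difference_quotient`, `affine`, and `remainder_ratio` are hypothetical, and the step sizes are arbitrary choices.

```python
import math

def f(t):
    """Unit circle parametrization f(t) = (cos t, sin t)."""
    return (math.cos(t), math.sin(t))

def difference_quotient(func, c, h):
    """(func(c + h) - func(c)) / h, computed coordinate by coordinate."""
    fc, fch = func(c), func(c + h)
    return tuple((b - a) / h for a, b in zip(fc, fch))

def affine(func, deriv, c):
    """Return A(t) = deriv * (t - c) + func(c) as a Python function."""
    fc = func(c)
    return lambda t: tuple(d * (t - c) + p for d, p in zip(deriv, fc))

c = math.pi / 6
Df_c = (-math.sin(c), math.cos(c))   # exact derivative from the example
A = affine(f, Df_c, c)

def remainder_ratio(h):
    """||f(c + h) - A(c + h)|| / |h|, which should tend to 0 with h."""
    return math.dist(f(c + h), A(c + h)) / abs(h)

ratios = [remainder_ratio(10.0 ** (-k)) for k in range(1, 6)]
```

The entries of `ratios` shrink roughly in proportion to $h$, and `difference_quotient(f, c, h)` approaches `Df_c` as `h` decreases, in line with (2.2.15) and (2.2.16).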



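Example Suppose we define $g : \mathbb{R} \to \mathbb{R}^2$ by $g(t) = (\sin(2\pi t), \cos(2\pi t))$, $-\infty < t < \infty$.
Then, as we saw in Section 2.1, $g$ parametrizes the same circle $C$ as $f$ in the previous
example. Moreover,

$$Dg(t) = (2\pi\cos(2\pi t), -2\pi\sin(2\pi t))$$

and $\|Dg(t)\| = 2\pi$ for all $t$, so $g$ is a smooth parametrization of $C$. However,

$$g\left(\frac{1}{6}\right) = \left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right),$$

that is, $g(t)$ is at $\left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right)$ when $t = \frac{1}{6}$, whereas $f(t)$ is at $\left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right)$ when $t = \frac{\pi}{6}$. Moreover,

$$Dg\left(\frac{1}{6}\right) = (\pi, -\pi\sqrt{3}),$$

so the best affine approximation to $g$ at $t = \frac{1}{6}$ is

$$B(t) = (\pi, -\pi\sqrt{3})\left(t - \frac{1}{6}\right) + \left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right).$$

Note that although $A$, the best affine approximation to $f$ at $t = \frac{\pi}{6}$, and $B$, the best affine
approximation to $g$ at $t = \frac{1}{6}$, are different functions, they parametrize the same line since

$$(\pi, -\pi\sqrt{3}) = -2\pi\left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right).$$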
Example Consider the helix $C$ parametrized by $f : \mathbb{R} \to \mathbb{R}^3$ defined by

$$f(t) = (4\cos(t), 4\sin(t), t).$$

Then

$$Df(t) = (-4\sin(t), 4\cos(t), 1).$$

Since $Df$ is continuous and

$$\|Df(t)\| = \sqrt{16\sin^2(t) + 16\cos^2(t) + 1} = \sqrt{17}$$

for all $t$, $f$ is a smooth parametrization of $C$. Now, for example,

$$f\left(\frac{\pi}{4}\right) = \left(2\sqrt{2}, 2\sqrt{2}, \frac{\pi}{4}\right)$$

and

$$Df\left(\frac{\pi}{4}\right) = (-2\sqrt{2}, 2\sqrt{2}, 1),$$

so the best affine approximation to $f$ at $t = \frac{\pi}{4}$ is

$$A(t) = (-2\sqrt{2}, 2\sqrt{2}, 1)\left(t - \frac{\pi}{4}\right) + \left(2\sqrt{2}, 2\sqrt{2}, \frac{\pi}{4}\right).$$

The helix $C$ and the line parametrized by $A$, namely, the tangent line to $C$ at $t = \frac{\pi}{4}$, are
shown in Figure 2.2.2.

Figure 2.2.2 Helix with tangent line at $\left(2\sqrt{2}, 2\sqrt{2}, \frac{\pi}{4}\right)$
Example Let $C$ be the curve in $\mathbb{R}^2$ parametrized by

$$h(t) = (\cos^3(t), \sin^3(t)).$$

Then

$$Dh(t) = (-3\cos^2(t)\sin(t), 3\sin^2(t)\cos(t)).$$

Hence $Dh$ is continuous for all $t$, but $h$ is not a smooth parametrization of $C$ since $Dh(t) = \mathbf{0}$
whenever $t$ is an integer multiple of $\frac{\pi}{2}$. These points correspond to the sharp corners of
$C$ at $(1, 0)$, $(0, 1)$, $(-1, 0)$, and $(0, -1)$, as shown in Figure 2.2.3. However, $h$ is a smooth
parametrization of the four arcs of $C$ which are parametrized by restricting $h$ to the open
intervals $\left(0, \frac{\pi}{2}\right)$, $\left(\frac{\pi}{2}, \pi\right)$, $\left(\pi, \frac{3\pi}{2}\right)$, and $\left(\frac{3\pi}{2}, 2\pi\right)$. Hence, for example, noting that

$$h\left(\frac{3\pi}{4}\right) = \left(-\frac{1}{2\sqrt{2}}, \frac{1}{2\sqrt{2}}\right)$$

and

$$Dh\left(\frac{3\pi}{4}\right) = \left(-\frac{3}{2\sqrt{2}}, -\frac{3}{2\sqrt{2}}\right),$$

we find that the best affine approximation to $h$ at $t = \frac{3\pi}{4}$ is

$$A(t) = \left(-\frac{3}{2\sqrt{2}}, -\frac{3}{2\sqrt{2}}\right)\left(t - \frac{3\pi}{4}\right) + \left(-\frac{1}{2\sqrt{2}}, \frac{1}{2\sqrt{2}}\right).$$

The tangent line parametrized by $A$ is shown in Figure 2.2.3.
Proposition Suppose $f : \mathbb{R} \to \mathbb{R}^n$, $g : \mathbb{R} \to \mathbb{R}^n$, and $\varphi : \mathbb{R} \to \mathbb{R}$ are all differentiable.
Then

$$D(f(t) + g(t)) = Df(t) + Dg(t), \tag{2.2.19}$$

$$D(f(t) - g(t)) = Df(t) - Dg(t), \tag{2.2.20}$$

$$D(\varphi(t)f(t)) = \varphi(t)Df(t) + \varphi'(t)f(t), \tag{2.2.21}$$

$$\frac{d}{dt}(f(t) \cdot g(t)) = f(t) \cdot Dg(t) + Df(t) \cdot g(t), \tag{2.2.22}$$

and

$$D(f(\varphi(t))) = Df(\varphi(t))\varphi'(t). \tag{2.2.23}$$

Note that all of the statements in this proposition reduce to familiar results from
one-variable calculus when $n = 1$. To verify these results, let

$$f(t) = (f_1(t), f_2(t), \ldots, f_n(t))$$

and

$$g(t) = (g_1(t), g_2(t), \ldots, g_n(t)).$$

Then

$$\begin{aligned} D(f(t) + g(t)) &= D(f_1(t) + g_1(t), f_2(t) + g_2(t), \ldots, f_n(t) + g_n(t)) \\ &= (f_1'(t) + g_1'(t), f_2'(t) + g_2'(t), \ldots, f_n'(t) + g_n'(t)) \\ &= (f_1'(t), f_2'(t), \ldots, f_n'(t)) + (g_1'(t), g_2'(t), \ldots, g_n'(t)) \\ &= Df(t) + Dg(t), \end{aligned} \tag{2.2.24}$$

verifying (2.2.19). The verification of (2.2.20) is similar. The demonstrations of (2.2.21)
and (2.2.22), both of which are generalizations of the product rule from one-variable calculus,
follow easily from that result; we will check (2.2.22) here and leave (2.2.21) for Problem
13. Using the product rule, we have

$$\begin{aligned} \frac{d}{dt}(f(t) \cdot g(t)) &= \frac{d}{dt}(f_1(t)g_1(t) + f_2(t)g_2(t) + \cdots + f_n(t)g_n(t)) \\ &= f_1(t)g_1'(t) + f_1'(t)g_1(t) + f_2(t)g_2'(t) + f_2'(t)g_2(t) + \cdots \\ &\qquad + f_n(t)g_n'(t) + f_n'(t)g_n(t) \\ &= f(t) \cdot Dg(t) + Df(t) \cdot g(t). \end{aligned} \tag{2.2.25}$$

Finally, (2.2.23), a generalization of the chain rule from one-variable calculus, follows
directly from that result:

$$\begin{aligned} D(f(\varphi(t))) &= D(f_1(\varphi(t)), f_2(\varphi(t)), \ldots, f_n(\varphi(t))) \\ &= (f_1'(\varphi(t))\varphi'(t), f_2'(\varphi(t))\varphi'(t), \ldots, f_n'(\varphi(t))\varphi'(t)) \\ &= Df(\varphi(t))\varphi'(t). \end{aligned} \tag{2.2.26}$$
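The product rule (2.2.22) for the dot product can be spot-checked numerically. In the Python sketch below, the particular functions `f` and `g` are arbitrary illustrative choices, not taken from the text, and the left-hand side is estimated by a central difference with an assumed step size.

```python
import math

def f(t):
    return (math.cos(t), math.sin(t), t)

def Df(t):
    return (-math.sin(t), math.cos(t), 1.0)

def g(t):
    return (t * t, math.exp(-t), 3.0 * t)

def Dg(t):
    return (2.0 * t, -math.exp(-t), 3.0)

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def product_rule_rhs(t):
    """Right-hand side of (2.2.22): f . Dg + Df . g."""
    return dot(f(t), Dg(t)) + dot(Df(t), g(t))

def numeric_lhs(t, h=1e-6):
    """Central-difference estimate of d/dt (f(t) . g(t))."""
    return (dot(f(t + h), g(t + h)) - dot(f(t - h), g(t - h))) / (2.0 * h)
```

For any sample `t`, `numeric_lhs(t)` and `product_rule_rhs(t)` agree up to the error of the finite-difference estimate.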
Reparametrizations

We have seen above that the parametrization of a curve $C$ in $\mathbb{R}^n$ is not unique. For example,
we saw that both $f(t) = (\cos(t), \sin(t))$ and $g(t) = (\sin(2\pi t), \cos(2\pi t))$ parametrize the unit
circle centered at the origin. However, we also noted that the best affine approximations
for the two parametrizations, although distinct functions, nevertheless parametrize the
same line at $\left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right)$, the line we have been calling the tangent line. We should suspect
that this will be the case in general, that is, the tangent line to a curve $C$ at a particular
point should not depend on the particular parametrization of $C$ used in the computation.
While avoiding some technicalities, we will provide some justification for these ideas.

Definition Suppose $\mathbf{x} = f(t)$, $a < t < b$, is a smooth parametrization of a curve $C$ in
$\mathbb{R}^n$. Suppose $\varphi : \mathbb{R} \to \mathbb{R}$ has domain $(c, d)$, range $(a, b)$, and $\varphi'$ exists and is continuous on
$(c, d)$. If $\varphi'(t) \neq 0$ for all $t$ in $(c, d)$, then we call $g(t) = f(\varphi(t))$ a reparametrization of $f$.

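Example Let $f(t) = (\cos(t), \sin(t))$ and $g(t) = (\sin(2\pi t), \cos(2\pi t))$. Since

$$\sin(t) = \cos\left(\frac{\pi}{2} - t\right)$$

and

$$\cos(t) = \sin\left(\frac{\pi}{2} - t\right),$$

it follows that

$$g(t) = f\left(\frac{\pi}{2} - 2\pi t\right) = f(\varphi(t)),$$

where

$$\varphi(t) = \frac{\pi}{2} - 2\pi t.$$

That is, $g$ is a reparametrization of $f$.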

Now if $\mathbf{x} = f(t)$, $a < t < b$, is a smooth parametrization of a curve $C$ in $\mathbb{R}^n$ and
$g(t) = f(\varphi(t))$, $c < t < d$, is a reparametrization of $f$, then for any $\alpha$ in $(c, d)$,

$$Dg(\alpha) = D(f(\varphi(\alpha))) = Df(\varphi(\alpha))\varphi'(\alpha). \tag{2.2.27}$$

Hence $Dg(\alpha)$ and $Df(\varphi(\alpha))$ are parallel, the former being the latter multiplied by the
scalar $\varphi'(\alpha)$. In other words, the lines parametrized by the best affine approximation to $g$
at $t = \alpha$ and the best affine approximation to $f$ at $t = \varphi(\alpha)$ are the same.

Example In our previous example, we have

$$\varphi'(t) = -2\pi,$$

so, for any $\alpha$, we should have

$$Dg(\alpha) = -2\pi Df(\varphi(\alpha)).$$

This agrees with our previous calculation using $\alpha = \frac{1}{6}$.
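The relation $Dg(\alpha) = -2\pi\, Df(\varphi(\alpha))$ can be confirmed numerically for several values of $\alpha$. This is a small Python sketch using the derivative formulas from the examples above; the names `PHI_PRIME` and `chain_rule_rhs` are illustrative assumptions.

```python
import math

def Df(t):
    """Derivative of f(t) = (cos t, sin t)."""
    return (-math.sin(t), math.cos(t))

def Dg(t):
    """Derivative of g(t) = (sin(2 pi t), cos(2 pi t))."""
    return (2 * math.pi * math.cos(2 * math.pi * t),
            -2 * math.pi * math.sin(2 * math.pi * t))

def phi(t):
    """The change of parameter with g(t) = f(phi(t))."""
    return math.pi / 2 - 2 * math.pi * t

PHI_PRIME = -2 * math.pi  # phi'(t) is constant

def chain_rule_rhs(alpha):
    """Df(phi(alpha)) * phi'(alpha), the right-hand side of (2.2.27)."""
    return tuple(PHI_PRIME * d for d in Df(phi(alpha)))
```

At $\alpha = \frac{1}{6}$ both `Dg(alpha)` and `chain_rule_rhs(alpha)` evaluate to $(\pi, -\pi\sqrt{3})$ up to rounding.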
Tangent and normal vectors

If $f : \mathbb{R} \to \mathbb{R}^n$ is a smooth parametrization of a curve $C$, then, for any $t$, $Df(t)$ is the
direction of the tangent line to $C$ at $f(t)$. Moreover, from our discussion above, if $g$ is a
reparametrization of $f$, say, $g(t) = f(\varphi(t))$, then $Dg(t)$ and $Df(\varphi(t))$ will have the same
or opposite direction. In other words, the direction of the tangent line either remains the
same or is reversed under reparametrization. On the other hand,

$$\|Dg(t)\| = \|Df(\varphi(t))\| \, |\varphi'(t)|. \tag{2.2.28}$$

As we should expect, although both $Dg(t)$ and $Df(\varphi(t))$ are tangent to the curve at $g(t)$,
their lengths do not have to be the same. In Section 2.3 we will discuss how we may think
of this in terms of the speed of a particle moving along the curve $C$, with its position on
$C$ at time $t$ given by either $g(t)$ or $f(t)$.

For these and other considerations, it is useful to define a standard tangent vector,
unique up to a change in sign.

Definition If $f : \mathbb{R} \to \mathbb{R}^n$ is a smooth parametrization of a curve $C$, then we call

$$T(t) = \frac{Df(t)}{\|Df(t)\|} \tag{2.2.29}$$

the unit tangent vector to $C$ at $f(t)$.

From the preceding, we must keep in mind that the unit tangent vector $T(t)$ is always
in reference to some parametrization $f$ of the curve $C$. Essentially, this is a choice of an
orientation for the curve, that is, the direction of motion for a particle whose position at
time $t$ is given by $f(t)$.

If $\mathbf{x} = f(t)$, $a < t < b$, is a smooth parametrization of a curve $C$ in $\mathbb{R}^n$, then, by
definition, $\|T(t)\| = 1$ for all $t$ in $(a, b)$. Hence

$$T(t) \cdot T(t) = 1 \tag{2.2.30}$$

for all $t$ in $(a, b)$. Differentiating (2.2.30), we have

$$\frac{d}{dt}(T(t) \cdot T(t)) = \frac{d}{dt} 1 = 0, \tag{2.2.31}$$

and so, using (2.2.22), we have

$$0 = \frac{d}{dt}(T(t) \cdot T(t)) = T(t) \cdot DT(t) + DT(t) \cdot T(t) = 2\,DT(t) \cdot T(t) \tag{2.2.32}$$

for all $t$ in $(a, b)$. Thus $T(t) \cdot DT(t) = 0$ for $a < t < b$. In other words, $DT(t)$ is orthogonal
to $T(t)$ for all $t$ in $(a, b)$.
Definition If $f : \mathbb{R} \to \mathbb{R}^n$ is a smooth parametrization of a curve $C$, $T(t)$ is the unit
tangent vector to $C$ at $f(t)$, and $DT(t) \neq \mathbf{0}$, then we call

$$N(t) = \frac{DT(t)}{\|DT(t)\|} \tag{2.2.33}$$

the principal unit normal vector to $C$ at $f(t)$.

Example Consider the parametrization of the circle in $\mathbb{R}^2$ with radius 2 and center at
the origin given by

$$f(t) = (2\cos(4t), 2\sin(4t)).$$

Then

$$Df(t) = (-8\sin(4t), 8\cos(4t))$$

and

$$\|Df(t)\| = \sqrt{64\sin^2(4t) + 64\cos^2(4t)} = 8.$$

Thus the unit tangent vector is

$$T(t) = \frac{Df(t)}{\|Df(t)\|} = (-\sin(4t), \cos(4t)).$$

Moreover,

$$DT(t) = (-4\cos(4t), -4\sin(4t)),$$

so

$$\|DT(t)\| = \sqrt{16\cos^2(4t) + 16\sin^2(4t)} = 4,$$

and the principal unit normal vector is

$$N(t) = \frac{DT(t)}{\|DT(t)\|} = (-\cos(4t), -\sin(4t)).$$

For example, when $t = \frac{\pi}{24}$ we have

$$f\left(\frac{\pi}{24}\right) = (\sqrt{3}, 1),$$

$$T\left(\frac{\pi}{24}\right) = \left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right),$$

and

$$N\left(\frac{\pi}{24}\right) = \left(-\frac{\sqrt{3}}{2}, -\frac{1}{2}\right).$$

Note that, for any value of $t$, $f(t) \perp T(t)$, $T(t) \perp N(t)$ (as is always the case), and
$N(t) = -\frac{1}{2}f(t)$. See Figure 2.2.4.
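The values of $T$ and $N$ in this example can be reproduced with a few lines of Python. This is a sketch under the formulas derived above; the helper `normalize` and the variable `t0` are illustrative names, and `DT` is entered from the hand computation rather than derived by the code.

```python
import math

def normalize(v):
    """Scale a nonzero 2-vector to unit length."""
    length = math.hypot(v[0], v[1])
    return tuple(c / length for c in v)

def f(t):
    return (2 * math.cos(4 * t), 2 * math.sin(4 * t))

def Df(t):
    return (-8 * math.sin(4 * t), 8 * math.cos(4 * t))

def T(t):
    """Unit tangent vector (2.2.29)."""
    return normalize(Df(t))

def DT(t):
    # Derivative of T(t) = (-sin(4t), cos(4t)), computed by hand.
    return (-4 * math.cos(4 * t), -4 * math.sin(4 * t))

def N(t):
    """Principal unit normal vector (2.2.33)."""
    return normalize(DT(t))

t0 = math.pi / 24
```

Evaluating `T(t0)` and `N(t0)` gives the vectors listed above up to rounding, and `N(t0)` equals `f(t0)` scaled by $-\frac{1}{2}$.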

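Example Consider the elliptical helix $H$ parametrized by

$$g(t) = (\cos(t), 2\sin(t), t).$$

Then

$$Dg(t) = (-\sin(t), 2\cos(t), 1),$$

so

$$\begin{aligned} \|Dg(t)\| &= \sqrt{\sin^2(t) + 4\cos^2(t) + 1} \\ &= \sqrt{\sin^2(t) + \cos^2(t) + 3\cos^2(t) + 1} \\ &= \sqrt{2 + \frac{3}{2}(1 + \cos(2t))} \\ &= \sqrt{\frac{7 + 3\cos(2t)}{2}}. \end{aligned}$$

Hence the unit tangent vector is

$$T(t) = \sqrt{\frac{2}{7 + 3\cos(2t)}}\,(-\sin(t), 2\cos(t), 1).$$

Differentiating using (2.2.21), we have

$$\begin{aligned} DT(t) &= \sqrt{\frac{2}{7 + 3\cos(2t)}}\,(-\cos(t), -2\sin(t), 0) \\ &\qquad + \frac{1}{2}\left(\frac{2}{7 + 3\cos(2t)}\right)^{-\frac{1}{2}}\left(\frac{12\sin(2t)}{(7 + 3\cos(2t))^2}\right)(-\sin(t), 2\cos(t), 1) \\ &= \sqrt{\frac{2}{7 + 3\cos(2t)}}\,(-\cos(t), -2\sin(t), 0) + \frac{3\sqrt{2}\sin(2t)}{(7 + 3\cos(2t))^{\frac{3}{2}}}\,(-\sin(t), 2\cos(t), 1). \end{aligned}$$

For example, at $t = \frac{\pi}{4}$ we have

$$T\left(\frac{\pi}{4}\right) = \frac{1}{\sqrt{7}}(-1, 2, \sqrt{2})$$

and

$$DT\left(\frac{\pi}{4}\right) = \frac{1}{\sqrt{7}}(-1, -2, 0) + \frac{3}{7\sqrt{7}}(-1, 2, \sqrt{2}) = \frac{1}{7\sqrt{7}}(-10, -8, 3\sqrt{2}).$$

Thus

$$\left\|DT\left(\frac{\pi}{4}\right)\right\| = \frac{1}{7\sqrt{7}}\sqrt{100 + 64 + 18} = \frac{\sqrt{182}}{7\sqrt{7}} = \frac{\sqrt{26}}{7},$$

so the principal unit normal vector at $t = \frac{\pi}{4}$ is

$$N\left(\frac{\pi}{4}\right) = \frac{DT\left(\frac{\pi}{4}\right)}{\left\|DT\left(\frac{\pi}{4}\right)\right\|} = \frac{1}{\sqrt{182}}(-10, -8, 3\sqrt{2}).$$

See Figure 2.2.5.

Figure 2.2.5 An elliptical helix with unit tangent and normal vectors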

As the last example shows, the computations involved in finding the unit tangent
vector and the principal unit normal vector can become involved. In fact, that is why
we computed the principal unit normal vector only in the particular case $t = \frac{\pi}{4}$ instead
of writing out the general formula for $N(t)$. In general these computations can become
involved enough that it is often wise to make use of a computer algebra system.
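Short of a full computer algebra system, a numerical cross-check is often enough to catch arithmetic slips. The Python sketch below estimates $DT(t)$ for the elliptical helix by central differences and compares the resulting unit normal at $t = \frac{\pi}{4}$ with the hand-computed value; the step size `h` and the helper names are assumptions made for illustration, and the result is an approximation rather than a symbolic derivation.

```python
import math

def g_prime(t):
    """Dg(t) for the elliptical helix g(t) = (cos t, 2 sin t, t)."""
    return (-math.sin(t), 2 * math.cos(t), 1.0)

def normalize(v):
    """Scale a nonzero vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def T(t):
    """Unit tangent vector Dg(t) / ||Dg(t)||."""
    return normalize(g_prime(t))

def DT_numeric(t, h=1e-5):
    """Central-difference estimate of DT(t), one coordinate at a time."""
    a, b = T(t - h), T(t + h)
    return tuple((q - p) / (2 * h) for p, q in zip(a, b))

def N_numeric(t):
    """Approximate principal unit normal vector."""
    return normalize(DT_numeric(t))

t0 = math.pi / 4
# Hand-computed value from the example: (-10, -8, 3*sqrt(2)) / sqrt(182).
N_exact = tuple(c / math.sqrt(182) for c in (-10.0, -8.0, 3.0 * math.sqrt(2)))
```

The vector `N_numeric(t0)` agrees with `N_exact` to well within the accuracy expected of the finite-difference estimate.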



Problems 



1. Find the derivative of each of the following functions. 

(a) $f(t) = (t^3, t, 2t + 4)$ (b) $g(t) = (3t\cos(2t), 4t\sin(2t))$

(c) $h(t) = (4t^3 - 3, \sin(t), e^{-2t})$ (d) $f(t) = (e^{-t}\sin(3t), e^{-t}\cos(3t), te^{-t})$

2. For each of the following, find the best affine approximation to / at the given point, 
(a) /(*) = (M 3 ), t = 2 (b) /(*) = (3sin(2t),4cos(2t)), t = | 

(c) /(£) = (cos(t),sin(t),cos(2t)), t= ^ (d) f(t) = (2 cos(2£), 3 sin(2t), St), t = 

3. Let fit) = (2cos(7rt),3sin(7rf)) parametrize an ellipse E in M 2 . Plot E along with the 
tangent line at / (|). 

4. Let fit) = ((l + 2cos(t))cos(t), (1 + 2 cos(t)) sin(t)) parametrize a curve C in 1R 2 . Plot 
C along with the tangent line at /(f)- 

5. Let h(t) = (sin(27rt), cos(27rt), |) parametrize a circular helix H in M 3 . Plot H along 
with the tangent line at h (|). 

6. Let = (cos(-7rf), y/i, sin(7rt)) parametrize a curve C in M 3 . Plot C along with the 
tangent line at g (|). 

7. Suppose $f : \mathbb{R} \to \mathbb{R}^2$ is defined by $f(t) = (t, \varphi(t))$, where $\varphi : \mathbb{R} \to \mathbb{R}$ is differentiable,
and let $C$ be the curve in $\mathbb{R}^2$ parametrized by $f$. Show that the tangent line to $C$ at
$f(c)$ is the same as the line tangent to the graph of $\varphi$ at $(c, \varphi(c))$.

8. Let $C$ be the curve in $\mathbb{R}^2$ parametrized by $f(t) = (t^3, t^6)$, $-\infty < t < \infty$. Is $f$ a smooth
parametrization of $C$? If not, can you find a smooth parametrization of $C$?
9. Let $C$ be the curve in $\mathbb{R}^2$ parametrized by $f(t) = (t^2, t^2)$, $-\infty < t < \infty$. Show that $f$
is not a smooth parametrization of $C$. Where is the problem? Plot $C$ and identify the
location of the problem.

10. Let $\mathbf{v} \neq \mathbf{0}$ and $\mathbf{p}$ be vectors in $\mathbb{R}^n$ and let $C$ be the curve in $\mathbb{R}^n$ parametrized by
$f(t) = t\mathbf{v} + \mathbf{p}$. What is the best affine approximation to $f$ at $t = t_0$?

11. For each of the following, find the unit tangent vector and the principal unit normal 
vector at the indicated point. 



(a) 


fit) 


= (M 2 ), * = 1 


(b) 


9it) = 


(3sin(2t),3cos(2t)), t = ^ 


(c) 


fit) 


= (2cos(t),4sin(t)), t= j 


(d) 


h(t) = 


3 

(cos(7r£), 2 sin(7rt)), t = - 


(e) 


git) 


TT 

= (cos(t),sin(t),t), t = - 


(f) 


fit) = 


(2sin(t),3cos(2t),2t), t = ^ 


(g) 


fit) 


= (sin(7rt), — cos(7rt), 3t), t = — 


(h) 


9it) = 


(cos(7rf 2 ),sin(7rf 2 ),t 2 ), t = 1 


(i) 


fit) 


= (t,t 2 ,t 3 ), t = 2 









12. Use the fact that $f(t) = (b\cos(t), b\sin(t))$ parametrizes a circle of radius $b$ to show that
a radius of a circle is always perpendicular to the tangent line at the point where the
radius touches the circle.

13. Verify (2.2.21); that is, show that if $f : \mathbb{R} \to \mathbb{R}^n$ and $\varphi : \mathbb{R} \to \mathbb{R}$ are both differentiable,
then

$$D(\varphi(t)f(t)) = \varphi(t)Df(t) + \varphi'(t)f(t).$$

14. Suppose $f : \mathbb{R} \to \mathbb{R}^3$ and $g : \mathbb{R} \to \mathbb{R}^3$ are both differentiable. Show that

$$D(f(t) \times g(t)) = f(t) \times Dg(t) + Df(t) \times g(t),$$

yet another version of the product rule.

15. The following figure illustrates a curve in $\mathbb{R}^2$ parametrized by some function $f : \mathbb{R} \to
\mathbb{R}^2$. If $T$ is the unit tangent vector at the indicated point on the curve, then either $M$
or $N$ is the principal unit normal vector at that point. Which one is it?
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Velocity and acceleration 

Consider a particle moving in space so that its position at time $t$ is given by $\mathbf{x}(t)$. We
think of $\mathbf{x}(t)$ as moving along a curve $C$ parametrized by a function $f$, where $f : \mathbb{R} \to \mathbb{R}^n$.
Hence we have $\mathbf{x}(t) = f(t)$, or, more simply, $\mathbf{x} = f(t)$. For us, $n$ will always be 2 or 3, but
there are physical situations in which it is reasonable to have larger values of $n$, and most
of what we do in this section will apply to those cases equally well. This is also a good
time to introduce the Leibniz notation for a derivative, thus writing

$$\frac{d\mathbf{x}}{dt} = Df(t). \tag{2.3.1}$$

At a given time $t_0$, the vector $\mathbf{x}(t_0 + h) - \mathbf{x}(t_0)$ represents the magnitude and direction
of the change of position of the particle along $C$ from time $t_0$ to time $t_0 + h$, as shown in
Figure 2.3.1. Dividing by $h$, we obtain a vector,

$$\frac{\mathbf{x}(t_0 + h) - \mathbf{x}(t_0)}{h}, \tag{2.3.2}$$

with the same direction (when $h > 0$), but with length approximating the average speed of the particle
over the time interval from $t_0$ to $t_0 + h$. Assuming differentiability and taking the limit as
$h$ approaches 0, we have the following definition.

Figure 2.3.1 Motion along a curve C

Motion Along a Curve Section 2.3

Definition Suppose $\mathbf{x}(t)$ is the position of a particle at time $t$ moving along a curve $C$
in $\mathbb{R}^n$. We call

$$\mathbf{v}(t) = \frac{d}{dt}\mathbf{x}(t) \tag{2.3.3}$$

the velocity of the particle at time $t$ and we call

$$s(t) = \|\mathbf{v}(t)\| \tag{2.3.4}$$

the speed of the particle at time $t$. Moreover, we call

$$\mathbf{a}(t) = \frac{d}{dt}\mathbf{v}(t) \tag{2.3.5}$$

the acceleration of the particle at time $t$.

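Example Consider a particle moving along an ellipse so that its position at any time $t$
is

$$\mathbf{x} = (2\cos(t), \sin(t)).$$

Then its velocity is

$$\mathbf{v} = (-2\sin(t), \cos(t)),$$

its speed is

$$s = \sqrt{4\sin^2(t) + \cos^2(t)} = \sqrt{3\sin^2(t) + 1},$$

and its acceleration is

$$\mathbf{a} = (-2\cos(t), -\sin(t)).$$

For example, at $t = \frac{\pi}{4}$ we have

$$\mathbf{x}\big|_{t=\frac{\pi}{4}} = \left(\sqrt{2}, \frac{1}{\sqrt{2}}\right),$$

$$\mathbf{v}\big|_{t=\frac{\pi}{4}} = \left(-\sqrt{2}, \frac{1}{\sqrt{2}}\right),$$

and

$$\mathbf{a}\big|_{t=\frac{\pi}{4}} = \left(-\sqrt{2}, -\frac{1}{\sqrt{2}}\right).$$

See Figure 2.3.2. Notice that, in this example, $\mathbf{a} = -\mathbf{x}$ for all values of $t$.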
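
The definitions of velocity, speed, and acceleration can be checked numerically for this ellipse. The following is only a sketch: the helper names (`first_derivative`, `second_derivative`, `norm`) and the step sizes are illustrative choices, not part of the text, and the derivatives are central-difference estimates rather than exact values.

```python
import math

def position(t):
    """Position x(t) = (2 cos t, sin t) of the particle on the ellipse."""
    return (2.0 * math.cos(t), math.sin(t))

def first_derivative(f, t, h=1e-5):
    """Central-difference estimate of Df(t), coordinate by coordinate."""
    return tuple((p - q) / (2.0 * h) for p, q in zip(f(t + h), f(t - h)))

def second_derivative(f, t, h=1e-4):
    """Central second-difference estimate of the second derivative of f at t."""
    return tuple((p - 2.0 * c + q) / (h * h)
                 for p, c, q in zip(f(t + h), f(t), f(t - h)))

def norm(u):
    """Euclidean length of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in u))

t0 = math.pi / 4
x = position(t0)                       # position
v = first_derivative(position, t0)     # velocity, estimate of dx/dt
a = second_derivative(position, t0)    # acceleration, estimate of dv/dt
s = norm(v)                            # speed

print("x =", x)
print("v =", v)
print("s =", s)
print("a =", a)
```

Up to discretization error, the printed acceleration should be the negative of the printed position, in line with the observation that a = -x for this motion.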




Curvature 

Suppose $\mathbf{x}$ is the position, $\mathbf{v}$ is the velocity, $s$ is the speed, and $\mathbf{a}$ is the acceleration, at
time $t$, of a particle moving along a curve $C$. Let $T(t)$ be the unit tangent vector and $N(t)$
be the principal unit normal vector at $\mathbf{x}$. Now

$$T(t) = \frac{\dfrac{d\mathbf{x}}{dt}}{\left\|\dfrac{d\mathbf{x}}{dt}\right\|} = \frac{\mathbf{v}}{s}, \tag{2.3.6}$$

so

$$\mathbf{v} = sT(t). \tag{2.3.7}$$

Thus

$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d}{dt}\,sT(t) = \frac{ds}{dt}T(t) + sDT(t). \tag{2.3.8}$$

Since

$$N(t) = \frac{DT(t)}{\|DT(t)\|}, \tag{2.3.9}$$

we have

$$\mathbf{a} = \frac{ds}{dt}T(t) + s\|DT(t)\|N(t). \tag{2.3.10}$$

Note that (2.3.10) expresses the acceleration of a particle as the sum of scalar multiples
of the unit tangent vector and the principal unit normal vector. That is,

$$\mathbf{a} = a_T T(t) + a_N N(t), \tag{2.3.11}$$

where

$$a_T = \frac{ds}{dt} \tag{2.3.12}$$

and

$$a_N = s\|DT(t)\|. \tag{2.3.13}$$

However, since $T(t)$ and $N(t)$ are orthogonal unit vectors, we also have

$$\begin{aligned}
\mathbf{a} \cdot T(t) &= (a_T T(t) + a_N N(t)) \cdot T(t) \\
&= a_T (T(t) \cdot T(t)) + a_N (T(t) \cdot N(t)) \\
&= a_T
\end{aligned} \tag{2.3.14}$$

and

$$\begin{aligned}
\mathbf{a} \cdot N(t) &= (a_T T(t) + a_N N(t)) \cdot N(t) \\
&= a_T (T(t) \cdot N(t)) + a_N (N(t) \cdot N(t)) \\
&= a_N.
\end{aligned} \tag{2.3.15}$$

Hence $a_T$ is the coordinate of $\mathbf{a}$ in the direction of $T(t)$ and $a_N$ is the coordinate of $\mathbf{a}$ in the
direction of $N(t)$. Thus (2.3.10) writes the acceleration as a sum of its component in the
direction of the unit tangent vector and its component in the direction of the principal unit
normal vector. In particular, this shows that the acceleration lies in the plane determined
by $T(t)$ and $N(t)$. Moreover, $a_T$ is the rate of change of speed, while $a_N$ is the product of
the speed $s$ and $\|DT(t)\|$, the magnitude of the rate of change of the unit tangent vector.
Since $\|T(t)\| = 1$ for all $t$, $\|DT(t)\|$ reflects only the rate at which the direction of $T(t)$
is changing; in other words, $\|DT(t)\|$ is a measurement of how fast the direction of the
particle moving along the curve $C$ is changing at time $t$. If we divide this by the speed
of the particle, we obtain a standard measurement of the rate of change of direction of $C$
itself.

Definition Given a curve $C$ with smooth parametrization $\mathbf{x} = f(t)$, we call

$$\kappa = \frac{\|DT(t)\|}{s(t)} \tag{2.3.16}$$

the curvature of $C$ at $f(t)$.

Using (2.3.16), we can rewrite (2.3.10) as

$$\mathbf{a} = \frac{ds}{dt}T(t) + s^2\kappa N(t). \tag{2.3.17}$$

Hence the coordinate of acceleration in the direction of the tangent vector is the rate of
change of the speed and the coordinate of acceleration in the direction of the principal
normal vector is the square of the speed times the curvature. Thus the greater the speed
or the tighter the curve, the larger the size of the normal component of acceleration; the
greater the rate at which speed is increasing, the greater the tangential component of
acceleration. This is why drivers are advised to slow down while approaching a curve, and
then to accelerate while driving through the curve.

Example Suppose a particle moves along a line in $\mathbb{R}^n$ so that its position at any time $t$
is given by

$$\mathbf{x} = t\mathbf{w} + \mathbf{p},$$

where $\mathbf{w} \neq \mathbf{0}$ and $\mathbf{p}$ are vectors in $\mathbb{R}^n$. Then the particle has velocity

$$\mathbf{v} = \frac{d\mathbf{x}}{dt} = \mathbf{w}$$

and speed $s = \|\mathbf{w}\|$, so the unit tangent vector is

$$T(t) = \frac{\mathbf{v}}{s} = \frac{\mathbf{w}}{\|\mathbf{w}\|}.$$

Hence $T(t)$ is a constant vector, so $DT(t) = \mathbf{0}$ and

$$\kappa = \frac{\|DT(t)\|}{s} = 0$$

for all $t$. In other words, a line has zero curvature, as we should expect since the tangent
vector never changes direction.

Example Consider a particle moving along a circle $C$ in $\mathbb{R}^2$ of radius $r > 0$ and center
$(a, b)$, with its position at time $t$ given by

$$\mathbf{x} = (r\cos(t) + a, r\sin(t) + b).$$

Then its velocity, speed, and acceleration are

$$\mathbf{v} = (-r\sin(t), r\cos(t)),$$

$$s = \sqrt{r^2\sin^2(t) + r^2\cos^2(t)} = r,$$

and

$$\mathbf{a} = (-r\cos(t), -r\sin(t)),$$

respectively. Hence the unit tangent vector is

$$T(t) = \frac{\mathbf{v}}{s} = (-\sin(t), \cos(t)).$$

Thus

$$DT(t) = (-\cos(t), -\sin(t))$$

and

$$\|DT(t)\| = \sqrt{\cos^2(t) + \sin^2(t)} = 1.$$

Hence the curvature of $C$ is, for all $t$,

$$\kappa = \frac{\|DT(t)\|}{s} = \frac{1}{r}.$$

Thus a circle has constant curvature, namely, the reciprocal of the radius of the circle. In
particular, the larger the radius of a circle, the smaller the curvature. Also, note that

$$\frac{ds}{dt} = \frac{d}{dt}r = 0,$$

so, from (2.3.10), we have

$$\mathbf{a} = rN(t),$$

which we can verify directly. That is, the acceleration has a normal component, but no
tangential component.
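
The circle computation can be cross-checked numerically straight from the definition of curvature as the length of DT(t) divided by the speed. This is a rough sketch under stated assumptions: the function names, the sample radius 3, the center (1, -2), and the finite-difference step sizes are all hypothetical choices made for illustration.

```python
import math

def norm(u):
    """Euclidean length of a vector given as a tuple."""
    return math.sqrt(sum(c * c for c in u))

def derivative(f, t, h=1e-5):
    """Central-difference estimate of Df(t), coordinate by coordinate."""
    return tuple((p - q) / (2.0 * h) for p, q in zip(f(t + h), f(t - h)))

def unit_tangent(f, t):
    """T(t) = v / s, with the velocity v estimated numerically."""
    v = derivative(f, t)
    s = norm(v)
    return tuple(c / s for c in v)

def curvature(f, t, h=1e-4):
    """Estimate kappa = ||DT(t)|| / s(t) for a smooth parametrization f."""
    s = norm(derivative(f, t))
    ahead, behind = unit_tangent(f, t + h), unit_tangent(f, t - h)
    DT = tuple((p - q) / (2.0 * h) for p, q in zip(ahead, behind))
    return norm(DT) / s

def circle(t, r=3.0, a=1.0, b=-2.0):
    """Circle of radius r centered at (a, b)."""
    return (r * math.cos(t) + a, r * math.sin(t) + b)

kappa = curvature(circle, 0.7)
print(kappa)  # should be close to 1/r = 1/3 for any choice of t
```

Because `curvature` only needs a callable returning a tuple, the same sketch can be pointed at other smooth parametrizations in two or three dimensions.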

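Example Now consider a particle moving along an ellipse $E$ so that its position at any
time $t$ is

$$\mathbf{x} = (2\cos(t), \sin(t)).$$

Then, as we saw above, the velocity and speed of the particle are

$$\mathbf{v} = (-2\sin(t), \cos(t))$$

and

$$s = \sqrt{3\sin^2(t) + 1},$$

respectively. For purposes of differentiation, it will be helpful to rewrite $s$ as

$$s = \sqrt{\frac{3}{2}(1 - \cos(2t)) + 1} = \sqrt{\frac{5 - 3\cos(2t)}{2}}.$$

Then the unit tangent vector is

$$T(t) = \sqrt{\frac{2}{5 - 3\cos(2t)}}\,(-2\sin(t), \cos(t)).$$

Thus

$$DT(t) = \sqrt{\frac{2}{5 - 3\cos(2t)}}\,(-2\cos(t), -\sin(t)) - \frac{3\sqrt{2}\sin(2t)}{(5 - 3\cos(2t))^{3/2}}\,(-2\sin(t), \cos(t)).$$

So, for example, at $t = \frac{\pi}{4}$, we have

$$\mathbf{x}\big|_{t=\frac{\pi}{4}} = \left(\sqrt{2}, \frac{1}{\sqrt{2}}\right),$$

$$\mathbf{v}\big|_{t=\frac{\pi}{4}} = \left(-\sqrt{2}, \frac{1}{\sqrt{2}}\right),$$

$$s\big|_{t=\frac{\pi}{4}} = \frac{\sqrt{5}}{\sqrt{2}},$$

$$T\left(\frac{\pi}{4}\right) = \frac{1}{\sqrt{5}}(-2, 1),$$

$$DT\left(\frac{\pi}{4}\right) = -\frac{1}{5\sqrt{5}}(4, 8),$$

and

$$\left\|DT\left(\frac{\pi}{4}\right)\right\| = \frac{1}{5\sqrt{5}}\sqrt{16 + 64} = \frac{4}{5}.$$

Hence the curvature of $E$ at $\left(\sqrt{2}, \frac{1}{\sqrt{2}}\right)$ is

$$\kappa\big|_{t=\frac{\pi}{4}} = \frac{\;\frac{4}{5}\;}{\frac{\sqrt{5}}{\sqrt{2}}} = \frac{4\sqrt{2}}{5\sqrt{5}} = 0.5060,$$

where the final numerical value has been rounded to four decimal places. Although the
general expression for $\kappa$ is complicated, it is easily computed and plotted using a computer
algebra system, as shown in Figure 2.3.3. Comparing this with the plot of this ellipse in
Figure 2.3.2, we can see why the curvature is greatest around $(2, 0)$ and $(-2, 0)$, corresponding
to $t = 0$, $t = \pi$, and $t = 2\pi$, and smallest at $(0, 1)$ and $(0, -1)$, corresponding to
$t = \frac{\pi}{2}$ and $t = \frac{3\pi}{2}$. Finally, as we saw above, the acceleration of the particle is

$$\mathbf{a} = (-2\cos(t), -\sin(t)),$$

so

$$\mathbf{a}\big|_{t=\frac{\pi}{4}} = \left(-\sqrt{2}, -\frac{1}{\sqrt{2}}\right).$$

Now if we write

$$\mathbf{a}\big|_{t=\frac{\pi}{4}} = a_T T\left(\frac{\pi}{4}\right) + a_N N\left(\frac{\pi}{4}\right),$$

then we may either compute, using (2.3.17),

$$a_T = \frac{ds}{dt}\bigg|_{t=\frac{\pi}{4}} = \frac{1}{\sqrt{2}}(5 - 3\cos(2t))^{-\frac{1}{2}}(3\sin(2t))\bigg|_{t=\frac{\pi}{4}} = \frac{3}{\sqrt{10}}$$

and

$$a_N = s^2\kappa = \frac{5}{2} \cdot \frac{4\sqrt{2}}{5\sqrt{5}} = \frac{2\sqrt{2}}{\sqrt{5}} = \frac{4}{\sqrt{10}},$$

or, using (2.3.14) and (2.3.15) together with $N\left(\frac{\pi}{4}\right) = \frac{DT(\pi/4)}{\|DT(\pi/4)\|} = -\frac{1}{\sqrt{5}}(1, 2)$,

$$a_T = \mathbf{a}\big|_{t=\frac{\pi}{4}} \cdot T\left(\frac{\pi}{4}\right) = \left(-\sqrt{2}, -\frac{1}{\sqrt{2}}\right) \cdot \frac{1}{\sqrt{5}}(-2, 1) = \frac{3}{\sqrt{10}}$$

and

$$a_N = \mathbf{a}\big|_{t=\frac{\pi}{4}} \cdot N\left(\frac{\pi}{4}\right) = \left(-\sqrt{2}, -\frac{1}{\sqrt{2}}\right) \cdot \left(-\frac{1}{\sqrt{5}}\right)(1, 2) = \frac{4}{\sqrt{10}}.$$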
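
As a cross-check on the ellipse example at t = π/4, the sketch below recovers the tangential and normal coordinates of the acceleration from dot products and then reads off the curvature from (2.3.17). It takes a slightly different route from the text, which is an assumption of this sketch: instead of differentiating T(t), it obtains N as the normalized part of the acceleration orthogonal to T, which is valid whenever the normal coordinate is positive. All variable names are illustrative.

```python
import math

def dot(u, w):
    """Dot product of two vectors given as tuples."""
    return sum(p * q for p, q in zip(u, w))

t0 = math.pi / 4

# Closed-form velocity and acceleration of x(t) = (2 cos t, sin t).
v = (-2.0 * math.sin(t0), math.cos(t0))
a = (-2.0 * math.cos(t0), -math.sin(t0))

s = math.sqrt(dot(v, v))               # speed
T = tuple(c / s for c in v)            # unit tangent vector, v / s

a_T = dot(a, T)                        # coordinate of a along T, as in (2.3.14)

# What is left of a after removing its tangential part is a_N * N.
normal_part = tuple(ai - a_T * Ti for ai, Ti in zip(a, T))
a_N = math.sqrt(dot(normal_part, normal_part))
N = tuple(c / a_N for c in normal_part)

kappa = a_N / s ** 2                   # from a_N = s^2 * kappa in (2.3.17)

print("a_T   =", a_T)
print("a_N   =", a_N)
print("N     =", N)
print("kappa =", kappa)
```

The printed values can be compared with 3/√10, 4/√10, and 4√2/(5√5) respectively.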
Arc length

Suppose a particle moves along a curve $C$ in $\mathbb{R}^n$ so that its position at time $t$ is given by
$\mathbf{x} = f(t)$ and let $D$ be the distance traveled by the particle from time $t = a$ to $t = b$. We
will suppose that $s(t) = \|\mathbf{v}(t)\|$ is continuous on $[a, b]$. To approximate $D$, we divide $[a, b]$
into $n$ subintervals, each of length

$$\Delta t = \frac{b - a}{n},$$

and label the endpoints of the subintervals $a = t_0, t_1, \ldots, t_n = b$. If $\Delta t$ is small, then the
distance the particle travels during the $j$th subinterval, $j = 1, 2, \ldots, n$, should be, approximately,
$s(t_{j-1})\Delta t$, an approximation which improves as $\Delta t$ decreases. Hence, for sufficiently
small $\Delta t$ (equivalently, sufficiently large $n$),

$$\sum_{j=1}^{n} s(t_{j-1})\Delta t \tag{2.3.18}$$

will provide an approximation as close to $D$ as desired. That is, we should define

$$D = \lim_{n \to \infty} \sum_{j=1}^{n} s(t_{j-1})\Delta t. \tag{2.3.19}$$

But (2.3.18) is a Riemann sum (in particular, a left-hand rule sum) which approximates
the definite integral

$$\int_a^b s(t)\,dt. \tag{2.3.20}$$

Hence the limit in (2.3.19) is the value of the definite integral (2.3.20), and so we have the
following definition.

Definition Suppose a particle moves along a curve $C$ in $\mathbb{R}^n$ so that its position at time
$t$ is given by $\mathbf{x} = f(t)$. Suppose the velocity $\mathbf{v}(t)$ is continuous on the interval $[a, b]$. Then
we define the distance traveled by the particle from time $t = a$ to time $t = b$ to be

$$\int_a^b \|\mathbf{v}(t)\|\,dt. \tag{2.3.21}$$

Note that the distance traveled is the length of the curve $C$ if the particle traverses $C$
exactly once. In that case, we call (2.3.21) the length of $C$. In general, for any $t$ such that
the interval $[a, t]$ is in the domain of $f$, we may calculate

$$\sigma(t) = \int_a^t \|\mathbf{v}(u)\|\,du, \tag{2.3.22}$$

which we call the arc length function for $C$.

Example Consider the helix $H$ parametrized by

$$f(t) = (\cos(t), \sin(t), t).$$

If we let $L$ denote the length of one complete loop of the helix, then a particle traveling
along $H$ according to $\mathbf{x} = f(t)$ will traverse this distance as $t$ goes from $0$ to $2\pi$. Since

$$\mathbf{v}(t) = (-\sin(t), \cos(t), 1),$$

we have

$$\|\mathbf{v}(t)\| = \sqrt{\sin^2(t) + \cos^2(t) + 1} = \sqrt{2}.$$

Hence

$$L = \int_0^{2\pi} \sqrt{2}\,dt = 2\sqrt{2}\pi.$$
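
The arc length function (2.3.22) for the helix can be approximated with the left-hand Riemann sum (2.3.18). The sketch below is illustrative only: the names `speed`, `arc_length`, and the number of subintervals are assumptions, and because the speed of this helix is constant the sum is exact up to rounding.

```python
import math

def speed(t):
    """Speed ||v(t)|| for the helix f(t) = (cos t, sin t, t)."""
    v = (-math.sin(t), math.cos(t), 1.0)
    return math.sqrt(sum(c * c for c in v))

def arc_length(t, a=0.0, n=1000):
    """Left-hand Riemann sum approximating sigma(t), the integral of the speed from a to t."""
    dt = (t - a) / n
    return dt * sum(speed(a + j * dt) for j in range(n))

one_loop = arc_length(2.0 * math.pi)
print(one_loop)                      # approximately 2 * sqrt(2) * pi
print(2.0 * math.sqrt(2.0) * math.pi)
```

Here the arc length function reduces to σ(t) = √2 t, so the helix is traversed at constant speed √2.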

Example Suppose a particle moves along a curve $C$ so that its position at time $t$ is given
by

$$\mathbf{x} = ((1 + 2\cos(t))\cos(t), (1 + 2\cos(t))\sin(t)).$$

Then $C$ is the curve in Figure 2.3.4, which is called a limaçon. The particle will traverse
this curve once as $t$ goes from $0$ to $2\pi$. Now

$$\mathbf{v} = (-(1 + 2\cos(t))\sin(t) - 2\sin(t)\cos(t), (1 + 2\cos(t))\cos(t) - 2\sin^2(t)),$$

so

$$\begin{aligned}
\|\mathbf{v}\|^2 &= \mathbf{v} \cdot \mathbf{v} \\
&= (1 + 2\cos(t))^2\sin^2(t) + 4(1 + 2\cos(t))\sin^2(t)\cos(t) + 4\sin^2(t)\cos^2(t) \\
&\quad + (1 + 2\cos(t))^2\cos^2(t) - 4(1 + 2\cos(t))\sin^2(t)\cos(t) + 4\sin^4(t) \\
&= (1 + 2\cos(t))^2(\sin^2(t) + \cos^2(t)) + 4\sin^2(t)\cos^2(t) + 4\sin^4(t) \\
&= (1 + 2\cos(t))^2 + 4\sin^2(t)\cos^2(t) + 4\sin^4(t).
\end{aligned}$$

Figure 2.3.4 A limaçon

Hence the length of $C$ is

$$\int_0^{2\pi} \sqrt{(1 + 2\cos(t))^2 + 4\sin^2(t)\cos^2(t) + 4\sin^4(t)}\,dt = 13.3649,$$

where the integration was performed with a computer and the final result rounded to four
decimal places. Note that integrating from $0$ to $4\pi$ would find the distance the particle
travels in going around $C$ twice, namely,

$$\int_0^{4\pi} \sqrt{(1 + 2\cos(t))^2 + 4\sin^2(t)\cos^2(t) + 4\sin^4(t)}\,dt = 26.7298.$$
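
The text does not say which software produced these values, so the following is only one possible way to reproduce them: a left-hand Riemann sum as in (2.3.18), with hypothetical function names and subinterval counts. The integrand is smooth and periodic, so even modest values of n give good accuracy here.

```python
import math

def speed(t):
    """Speed of the particle on the limacon, from the expression for ||v||^2 above."""
    c, s = math.cos(t), math.sin(t)
    return math.sqrt((1.0 + 2.0 * c) ** 2 + 4.0 * s * s * c * c + 4.0 * s ** 4)

def left_riemann_sum(f, a, b, n):
    """Left-hand rule sum of f over [a, b] with n equal subintervals."""
    dt = (b - a) / n
    return dt * sum(f(a + j * dt) for j in range(n))

# Watch the approximations settle down as n grows.
for n in (4, 16, 64, 256):
    print(n, left_riemann_sum(speed, 0.0, 2.0 * math.pi, n))

once = left_riemann_sum(speed, 0.0, 2.0 * math.pi, 2000)
twice = left_riemann_sum(speed, 0.0, 4.0 * math.pi, 4000)
print(round(once, 4), round(twice, 4))
```

As a side remark, the expression under the square root simplifies to 5 + 4 cos(t), which can be used to sanity-check `speed`.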



Problems 



1. For each of the following, suppose a particle is moving along a curve so that its position
at time $t$ is given by $\mathbf{x} = f(t)$. Find the velocity and acceleration of the particle.

(a) $f(t) = (t^2 + 3, \sin(t))$

(b) $f(t) = (t^2 e^{-2t}, t^3 e^{-2t}, 3t)$

(c) $f(t) = (\cos(3t^2), \sin(3t^2))$

(d) $f(t) = (t\cos(t^2), t\sin(t^2), 3t\cos(t^2))$

2. Find the curvature of the following curves at the given point.

(a) $f(t) = (t, t^2)$, $t = 1$

(b) $f(t) = (3\cos(t), \sin(t))$, $t =$ [value illegible in source]

(c) $f(t) = (\cos(t), \sin(t), t)$, $t =$ [value illegible in source]

(d) $f(t) = (\cos(t), \sin(t), e^{-t})$, $t = 0$

3. Plot the curvature for each of the following curves over the given interval $I$.

(a) $f(t) = (t, t^2)$, $I = [-2, 2]$

(b) $f(t) = (\cos(t), 3\sin(t))$, $I = [0, 2\pi]$

(c) $g(t) = ((1 + 2\cos(t))\cos(t), (1 + 2\cos(t))\sin(t))$, $I = [0, 2\pi]$

(d) $h(t) = (2\cos(t), \sin(t), 2t)$, $I = [0, 2\pi]$

(e) $f(t) = (4\cos(t) + \sin(4t), 4\sin(t) + \sin(4t))$, $I = [0, 2\pi]$

4. For each of the following, suppose a particle is moving along a curve so that its position
at time $t$ is given by $\mathbf{x} = f(t)$. Find the coordinates of acceleration in the direction of
the unit tangent vector and in the direction of the principal unit normal vector at the
specified point. Write the acceleration as a sum of scalar multiples of the unit tangent
vector and the principal unit normal vector.

(a) $f(t) = (\sin(t), \cos(t))$, $t =$ [value illegible in source]

(b) $f(t) = (\cos(t), 3\sin(t))$, $t =$ [value illegible in source]

(c) $f(t) = (t, t^2)$, $t = 1$

(d) $f(t) = (\sin(t), \cos(t), t)$, $t =$ [value illegible in source]

5. Suppose a particle moves along a curve $C$ in $\mathbb{R}^3$ so that its position at time $t$ is given by
$\mathbf{x} = f(t)$. Let $\mathbf{v}$, $s$, and $\mathbf{a}$ denote the velocity, speed, and acceleration of the particle,
respectively, and let $\kappa$ be the curvature of $C$.

(a) Using the facts $\mathbf{v} = sT(t)$ and

$$\mathbf{a} = \frac{ds}{dt}T(t) + s^2\kappa N(t),$$

show that

$$\mathbf{v} \times \mathbf{a} = s^3\kappa(T(t) \times N(t)).$$

(b) Use the result of part (a) to show that

$$\kappa = \frac{\|\mathbf{v} \times \mathbf{a}\|}{\|\mathbf{v}\|^3}.$$

6. Let $H$ be the helix in $\mathbb{R}^3$ parametrized by $f(t) = (\cos(t), \sin(t), t)$. Use the result from
Problem 5 to compute the curvature $\kappa$ of $H$ for any time $t$.

7. Let $C$ be the elliptical helix in $\mathbb{R}^3$ parametrized by $f(t) = (4\cos(t), 2\sin(t), t)$. Use the
result from Problem 5 to compute the curvature $\kappa$ of $C$ at $t =$ [value illegible in source].

8. Let $C$ be the curve in $\mathbb{R}^2$ which is the graph of the function $\varphi : \mathbb{R} \to \mathbb{R}$. Use the result
from Problem 5 to show that the curvature of $C$ at the point $(t, \varphi(t))$ is

$$\kappa = \frac{|\varphi''(t)|}{(1 + (\varphi'(t))^2)^{3/2}}.$$

9. Let $P$ be the graph of $f(t) = t^2$. Use the result from Problem 8 to find the curvature
of $P$ at $(1, 1)$ and $(2, 4)$.

10. Let $C$ be the graph of $f(t) = t^3$. Use the result from Problem 8 to find the curvature
of $C$ at $(1, 1)$ and $(2, 8)$.

11. Let $C$ be the graph of $g(t) = \sin(t)$. Use the result from Problem 8 to find the curvature
of $C$ at $\left(\frac{\pi}{2}, 1\right)$ and at [second point illegible in source].

12. For each of the following, suppose a particle is moving along a curve so that its position
at time $t$ is given by $\mathbf{x} = f(t)$. Find the distance traveled by the particle over the given
time interval.

(a) $f(t) = (\sin(t), 3\cos(t))$, $I = [0, 2\pi]$

(b) $f(t) = (\cos(\pi t), \sin(\pi t), 2t)$, $I = [0, 4]$

(c) $f(t) = (t, t^2)$, $I = [0, 2]$

(d) $f(t) = (t\cos(t), t\sin(t))$, $I = [0, 2\pi]$

(e) $f(t) = (\cos(2\pi t), \sin(2\pi t), 3t^2, t)$, $I = [0, 1]$

(f) $f(t) = (e^{-t}\cos(\pi t), e^{-t}\sin(\pi t))$, $I = [-2, 2]$

(g) $f(t) = (4\cos(t) + \sin(4t), 4\sin(t) + \sin(4t))$, $I = [0, 2\pi]$

13. Verify that the circumference of a circle of radius $r$ is $2\pi r$.

14. The curve parametrized by

$$f(t) = (\sin(2t)\cos(t), \sin(2t)\sin(t))$$

has four "petals." Find the length of one of these petals.

15. The curve $C$ parametrized by $h(t) = (\cos^3(t), \sin^3(t))$ is called a hypocycloid (see
Figure 2.2.3 in Section 2.2). Find the length of $C$.

16. Suppose $\varphi : \mathbb{R} \to \mathbb{R}$ is continuously differentiable and let $C$ be the part of the graph of
$\varphi$ over the interval $[a, b]$. Show that the length of $C$ is

$$\int_a^b \sqrt{1 + (\varphi'(t))^2}\,dt.$$

17. Use the result from Problem 16 to find the length of one arch of the graph of $f(t) = \sin(t)$.

18. Let $h : \mathbb{R} \to \mathbb{R}^n$ parametrize a curve $C$. We say $C$ is parametrized by arc length if
$\|Dh(t)\| = 1$ for all $t$.

(a) Let $\sigma$ be the arc length function for $C$ using the parametrization $f$ and let $\sigma^{-1}$ be its
inverse function. Show that the function $g : \mathbb{R} \to \mathbb{R}^n$ defined by $g(u) = f(\sigma^{-1}(u))$
parametrizes $C$ by arc length.

(b) Let $C$ be the circular helix in $\mathbb{R}^3$ with parametrization $f(t) = (\cos(t), \sin(t), t)$.
Find a function $g : \mathbb{R} \to \mathbb{R}^n$ which parametrizes $C$ by arc length.

19. Suppose $f : \mathbb{R} \to \mathbb{R}^n$ is continuous on the closed interval $[a, b]$ and has coordinate
functions $f_1, f_2, \ldots, f_n$. We define the definite integral of $f$ over the interval $[a, b]$ to be

$$\int_a^b f(t)\,dt = \left(\int_a^b f_1(t)\,dt, \int_a^b f_2(t)\,dt, \ldots, \int_a^b f_n(t)\,dt\right).$$

Show that if a particle moves so its velocity at time $t$ is $\mathbf{v}(t)$, then, assuming $\mathbf{v}$ is a
continuous function on an interval $[a, b]$, the position of the particle for any time $t$ in
$[a, b]$ is given by

$$\mathbf{x}(t) = \int_a^t \mathbf{v}(s)\,ds + \mathbf{x}(a).$$

20. Suppose a particle moves along a curve in $\mathbb{R}^3$ so that its velocity at any time $t$ is

$$\mathbf{v}(t) = (\cos(2t), \sin(2t), 3t).$$

If the particle is at $(0, 1, 0)$ when $t = 0$, use Problem 19 to determine its position for
any other time $t$.

21. Suppose a particle moves along a curve in $\mathbb{R}^3$ so that its acceleration at any time $t$ is

$$\mathbf{a}(t) = (\cos(t), \sin(t), 0).$$

If the particle is at $(1, 2, 0)$ with velocity $(0, 1, 1)$ at time $t = 0$, use Problem 19 to
determine its position for any other time $t$.

22. Suppose a projectile is fired from the ground at an angle $\alpha$ with an initial speed
$v_0$, as shown in Figure 2.3.5. Let $\mathbf{x}(t)$, $\mathbf{v}(t)$, and $\mathbf{a}(t)$ be the position, velocity, and
acceleration, respectively, of the projectile at time $t$.

Figure 2.3.5 The path of a projectile

(a) Explain why $\mathbf{x}(0) = (0, 0)$, $\mathbf{v}(0) = (v_0\cos(\alpha), v_0\sin(\alpha))$, and $\mathbf{a}(t) = (0, -g)$ for all
$t$, where $g = 9.8$ meters per second per second is the acceleration due to gravity.

(b) Use Problem 19 to find $\mathbf{v}(t)$.

(c) Use Problem 19 to find $\mathbf{x}(t)$.

(d) Show that the curve parametrized by $\mathbf{x}(t)$ is a parabola. That is, let $\mathbf{x}(t) = (x, y)$
and show that $y = ax^2 + bx + c$ for some constants $a$, $b$, and $c$.

(e) Show that the range of the projectile, that is, the horizontal distance traveled, is

$$R = \frac{v_0^2\sin(2\alpha)}{g}$$

and conclude that the range is maximized when $\alpha = \frac{\pi}{4}$.

(f) When does the projectile hit the ground?

(g) What is the maximum height reached by the projectile? When does it reach this
height?

23. Suppose $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_m$ are unit vectors in $\mathbb{R}^n$, $m \le n$, which are mutually orthogonal
(that is, $\mathbf{a}_i \perp \mathbf{a}_j$ when $i \neq j$). If $\mathbf{x}$ is a vector in $\mathbb{R}^n$ with

$$\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_m\mathbf{a}_m,$$

show that $x_i = \mathbf{x} \cdot \mathbf{a}_i$, $i = 1, 2, \ldots, m$.

In this chapter we will study functions $f : \mathbb{R}^n \to \mathbb{R}$, functions which take vectors for inputs
and give scalars for outputs. For example, the function that takes a point in space for
input and gives back the temperature at that point is such a function; the function that
reports the gross national product of a country is another such function. Note that the
domain space of the first example is three-dimensional, while the domain of the latter has,
for most countries, thousands of dimensions. As usual, whenever possible we will state our
results for an arbitrary $n$-dimensional space, although most of our examples will deal with
only two or three dimensions.

Level sets and graphs

We begin by considering some geometrical methods for picturing functions of the form
$f : \mathbb{R}^n \to \mathbb{R}$.

Definition Given a function $f : \mathbb{R}^n \to \mathbb{R}$ and a real number $c$, we call the set

$$L = \{(x_1, x_2, \ldots, x_n) : f(x_1, x_2, \ldots, x_n) = c\} \tag{3.1.1}$$

a level set of $f$ at level $c$. We also call $L$ a contour of $f$. When $n = 2$, we call $L$ a level
curve of $f$ and when $n = 3$ we call $L$ a level surface of $f$. A plot displaying level sets for
several different levels is called a contour plot.

Example Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$f(x, y) = 2x^2 + y^2.$$

Given a real number $c$, the set of all points satisfying

$$2x^2 + y^2 = c$$

is a level set of $f$. For $c < 0$, this set is empty; for $c = 0$, it consists of only the point $(0, 0)$;
for any $c > 0$, the level set is an ellipse with center at $(0, 0)$. Hence a contour plot of $f$, as
shown in Figure 3.1.1, consists of concentric ellipses.
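
One way to convince oneself that the level sets for c > 0 are ellipses is to walk around a candidate ellipse and evaluate f along the way. The parametrization used below is an assumption introduced for illustration (it does not appear in the text), as are the function names and the sample levels.

```python
import math

def f(x, y):
    """The function f(x, y) = 2x^2 + y^2 from the example."""
    return 2.0 * x * x + y * y

def level_curve_point(c, t):
    """Hypothetical parametrization of the level curve 2x^2 + y^2 = c, for c > 0."""
    return (math.sqrt(c / 2.0) * math.cos(t), math.sqrt(c) * math.sin(t))

levels = [1.0, 2.0, 4.0]
angles = [2.0 * math.pi * k / 12 for k in range(12)]

# Largest deviation of f from c over all sampled points on all sampled level curves.
max_error = max(abs(f(*level_curve_point(c, t)) - c)
                for c in levels for t in angles)
print(max_error)  # tiny: every sampled point lies on its level curve up to rounding
```

The semi-axes of the curve at level c are √(c/2) along the x-axis and √c along the y-axis, so the ellipses grow with c while keeping the same shape.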

Figure 3.1.1 Level curves $2x^2 + y^2 = c$

Example Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$f(x, y) = \frac{\sin\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}}.$$

For any point $(x, y)$ on the circle of radius $r > 0$ centered at the origin, $f(x, y)$ has the
constant value

$$\frac{\sin(r)}{r}.$$

Hence a contour plot of $f$, like that shown in Figure 3.1.2, consists of concentric circles
centered at the origin.

Example Suppose $f : \mathbb{R}^3 \to \mathbb{R}$ is defined by

$$f(x, y, z) = x^2 + 2y^2 + 3z^2.$$

The level surface of $f$ with equation

$$x^2 + 2y^2 + 3z^2 = 1$$

is shown in Figure 3.1.3. Note that, for example, fixing a value $z_0$ of $z$ yields the equation

$$x^2 + 2y^2 = 1 - 3z_0^2,$$

which, for $|z_0| < \frac{1}{\sqrt{3}}$, is the equation of an ellipse. This explains why a slice of the level surface shown in Figure
3.1.3 parallel to the $xy$-plane is an ellipse. Similarly, slices parallel to the $xz$-plane and the
$yz$-plane are ellipses, which is why this surface is an example of an ellipsoid.

Figure 3.1.3 The level surface $x^2 + 2y^2 + 3z^2 = 1$

Figure 3.1.4 The paraboloid $z = 2x^2 + y^2$

Definition Given a function $f : \mathbb{R}^n \to \mathbb{R}$, we call the set

$$G = \{(x_1, x_2, \ldots, x_n, x_{n+1}) : x_{n+1} = f(x_1, x_2, \ldots, x_n)\} \tag{3.1.2}$$

the graph of $f$.

Note that the graph $G$ of a function $f : \mathbb{R}^n \to \mathbb{R}$ is in $\mathbb{R}^{n+1}$. As a consequence, we can
picture $G$ only if $n = 1$, in which case $G$ is a curve as studied in single-variable calculus,
or $n = 2$, in which case $G$ is a surface in $\mathbb{R}^3$.

Example Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by

$$f(x, y) = 2x^2 + y^2.$$

The graph of $f$ is then the set of all points $(x, y, z)$ in $\mathbb{R}^3$ which satisfy the equation
$z = 2x^2 + y^2$. One way to picture the graph of $f$ is to imagine raising the level curves
in Figure 3.1.1 to their respective heights above the $xy$-plane, creating the surface in $\mathbb{R}^3$
shown in Figure 3.1.4. Another way to picture the graph is to consider slices of the graph
lying above a grid of lines parallel to the axes in the $xy$-plane. For example, for a fixed
value of $x$, say $x_0$, the set of points satisfying the equation $z = 2x_0^2 + y^2$ is a parabola lying
above the line $x = x_0$. Similarly, fixing a value $y_0$ of $y$ yields the parabola $z = 2x^2 + y_0^2$
lying above the line $y = y_0$. If we draw these parabolas for numerous lines of the form
$x = x_0$ and $y = y_0$, we obtain a wire-frame of the graph. The graph shown in Figure 3.1.4
was obtained by filling in the surface patches of a wire-frame mesh, the outline of which is
visible on the surface. This surface is an example of a paraboloid.




Example Although the graphs of many functions may be sketched reasonably well by
hand using the ideas of the previous example, for most functions a good picture of the graph
requires either computer graphics or considerable artistic skill. For example, consider the
graph of

$$f(x, y) = \frac{\sin\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}}.$$

Using the contour plot, we can imagine how the graph of $f$ oscillates as we move away
from the origin, the level circles of the contour plot rising and falling with the oscillations
of

$$\frac{\sin(r)}{r},$$

where $r = \sqrt{x^2 + y^2}$. Equivalently, the slice of the graph above any line through the origin
will be the graph of

$$z = \frac{\sin(r)}{r}.$$

This should give you a good idea what the graph of $f$ looks like, but, nevertheless, most of
us could not produce the picture of Figure 3.1.5 without the aid of a computer. Notice that
although $f$ is not defined at $(0, 0)$, it appears that $f(x, y)$ approaches 1 as $(x, y)$ approaches
$(0, 0)$. This is in fact true, a consequence of the fact that

$$\lim_{r \to 0} \frac{\sin(r)}{r} = 1.$$

We will return to this example after we have introduced limits and continuity.
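
Limits and continuity

By now the following two definitions should look familiar.

Definition Let $\mathbf{c}$ be a point in $\mathbb{R}^n$ and let $O$ be the set of all points in the open ball of
radius $r > 0$ centered at $\mathbf{c}$ except $\mathbf{c}$ itself. That is,

$$O = \{\mathbf{x} : \mathbf{x} \text{ is in } B^n(\mathbf{c}, r),\ \mathbf{x} \neq \mathbf{c}\}. \tag{3.1.3}$$

Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined for all $\mathbf{x}$ in $O$. We say the limit of $f(\mathbf{x})$ as $\mathbf{x}$ approaches $\mathbf{c}$
is $L$, written $\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = L$, if for every sequence of points $\{\mathbf{x}_m\}$ in $O$,

$$\lim_{m \to \infty} f(\mathbf{x}_m) = L \tag{3.1.4}$$

whenever $\lim_{m \to \infty} \mathbf{x}_m = \mathbf{c}$.

Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined for all $\mathbf{x}$ in some open ball $B^n(\mathbf{c}, r)$, $r > 0$.
We say $f$ is continuous at $\mathbf{c}$ if

$$\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = f(\mathbf{c}). \tag{3.1.5}$$

The following basic properties of limits follow immediately from the analogous properties
for limits of sequences.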

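Proposition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}$ with

$$\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = L$$

and

$$\lim_{\mathbf{x} \to \mathbf{c}} g(\mathbf{x}) = M.$$

Then

$$\lim_{\mathbf{x} \to \mathbf{c}} (f(\mathbf{x}) + g(\mathbf{x})) = L + M, \tag{3.1.6}$$

$$\lim_{\mathbf{x} \to \mathbf{c}} (f(\mathbf{x}) - g(\mathbf{x})) = L - M, \tag{3.1.7}$$

$$\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x})g(\mathbf{x}) = LM, \tag{3.1.8}$$

$$\lim_{\mathbf{x} \to \mathbf{c}} \frac{f(\mathbf{x})}{g(\mathbf{x})} = \frac{L}{M} \quad \text{(provided } M \neq 0\text{)}, \tag{3.1.9}$$

and

$$\lim_{\mathbf{x} \to \mathbf{c}} kf(\mathbf{x}) = kL \tag{3.1.10}$$

for any scalar $k$.

Now suppose $f : \mathbb{R}^n \to \mathbb{R}$, $h : \mathbb{R} \to \mathbb{R}$,

$$\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = L, \tag{3.1.11}$$

and $h$ is continuous at $L$. Then for any sequence $\{\mathbf{x}_m\}$ in $\mathbb{R}^n$ with

$$\lim_{m \to \infty} \mathbf{x}_m = \mathbf{c}, \tag{3.1.12}$$

we have

$$\lim_{m \to \infty} f(\mathbf{x}_m) = L, \tag{3.1.13}$$

and so

$$\lim_{m \to \infty} h(f(\mathbf{x}_m)) = h(L) \tag{3.1.14}$$

by the continuity of $h$ at $L$. Thus we have the following result about compositions of
functions.

Proposition If $f : \mathbb{R}^n \to \mathbb{R}$, $h : \mathbb{R} \to \mathbb{R}$,

$$\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = L,$$

and $h$ is continuous at $L$, then

$$\lim_{\mathbf{x} \to \mathbf{c}} h \circ f(\mathbf{x}) = \lim_{\mathbf{x} \to \mathbf{c}} h(f(\mathbf{x})) = h(L). \tag{3.1.15}$$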



Example Suppose we define $f : \mathbb{R}^n \to \mathbb{R}$ by

$$f(x_1, x_2, \ldots, x_n) = x_k,$$

where $k$ is a fixed integer between 1 and $n$. If $\mathbf{a} = (a_1, a_2, \ldots, a_n)$ is a point in $\mathbb{R}^n$ and
$\lim_{m \to \infty} \mathbf{x}_m = \mathbf{a}$, then

$$\lim_{m \to \infty} f(\mathbf{x}_m) = \lim_{m \to \infty} x_{mk} = a_k,$$

where $x_{mk}$ is the $k$th coordinate of $\mathbf{x}_m$. Thus

$$\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = a_k.$$

This result is a basic building block for the examples that follow. For a particular example,
if $f(x, y) = x$, then

$$\lim_{(x,y) \to (2,3)} f(x, y) = \lim_{(x,y) \to (2,3)} x = 2.$$

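Example If we define $f : \mathbb{R}^3 \to \mathbb{R}$ by

$$f(x, y, z) = xyz,$$

then, using (3.1.8) in combination with the previous example,

$$\begin{aligned}
\lim_{(x,y,z) \to (a,b,c)} f(x, y, z) &= \lim_{(x,y,z) \to (a,b,c)} xyz \\
&= \left(\lim_{(x,y,z) \to (a,b,c)} x\right)\left(\lim_{(x,y,z) \to (a,b,c)} y\right)\left(\lim_{(x,y,z) \to (a,b,c)} z\right) \\
&= abc
\end{aligned}$$

for any point $(a, b, c)$ in $\mathbb{R}^3$. For example,

$$\lim_{(x,y,z) \to (3,2,1)} f(x, y, z) = \lim_{(x,y,z) \to (3,2,1)} xyz = (3)(2)(1) = 6.$$

Example Combining the previous examples with (3.1.6), (3.1.7), (3.1.8), and (3.1.10),
we have

$$\begin{aligned}
\lim_{(x,y,z) \to (2,1,3)} (xy^2 + 3xyz - 6xz) &= \left(\lim_{(x,y,z) \to (2,1,3)} x\right)\left(\lim_{(x,y,z) \to (2,1,3)} y\right)\left(\lim_{(x,y,z) \to (2,1,3)} y\right) \\
&\quad + 3\left(\lim_{(x,y,z) \to (2,1,3)} x\right)\left(\lim_{(x,y,z) \to (2,1,3)} y\right)\left(\lim_{(x,y,z) \to (2,1,3)} z\right) \\
&\quad - 6\left(\lim_{(x,y,z) \to (2,1,3)} x\right)\left(\lim_{(x,y,z) \to (2,1,3)} z\right) \\
&= (2)(1)(1) + (3)(2)(1)(3) - (6)(2)(3) \\
&= -16.
\end{aligned}$$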
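
The sequence definition of the limit can be illustrated by evaluating this polynomial along a sequence of points converging to (2, 1, 3). The particular sequence below is an arbitrary illustrative choice, and checking one sequence is of course not a proof, only a sanity check on the value -16.

```python
def p(x, y, z):
    """The polynomial xy^2 + 3xyz - 6xz from the example."""
    return x * y ** 2 + 3 * x * y * z - 6 * x * z

def x_m(m):
    """An arbitrary sequence of points converging to (2, 1, 3) as m grows."""
    return (2 + 1 / m, 1 - 1 / m, 3 + 1 / m ** 2)

values = [p(*x_m(m)) for m in (1, 10, 100, 1000, 10000)]
limit = p(2, 1, 3)

for value in values:
    print(value)          # these approach the limit as m increases
print("p(2, 1, 3) =", limit)
```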

The last three examples are all examples of polynomials in several variables. In general,
a function $f : \mathbb{R}^n \to \mathbb{R}$ of the form

$$f(x_1, x_2, \ldots, x_n) = a x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n},$$

where $a$ is a scalar and $i_1, i_2, \ldots, i_n$ are nonnegative integers, is called a monomial. A
function which is a sum of monomials is called a polynomial. The following proposition is
a consequence of the previous examples and (3.1.6), (3.1.7), (3.1.8), and (3.1.10).

Proposition If $f : \mathbb{R}^n \to \mathbb{R}$ is a polynomial, then for any point $\mathbf{c}$ in $\mathbb{R}^n$,

$$\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = f(\mathbf{c}). \tag{3.1.16}$$

In other words, $f$ is continuous at every point $\mathbf{c}$ in $\mathbb{R}^n$.

If $g$ and $h$ are both polynomials, then we call the function

$$f(\mathbf{x}) = \frac{g(\mathbf{x})}{h(\mathbf{x})} \tag{3.1.17}$$

a rational function. The next proposition is a consequence of the previous proposition and
(3.1.9).

Proposition If $f$ is a rational function defined at $\mathbf{c}$, then

$$\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = f(\mathbf{c}). \tag{3.1.18}$$

In other words, $f$ is continuous at every point $\mathbf{c}$ in its domain.

Example Since

$$f(x, y, z) = \frac{x^2 y + 3xyz^2}{4x^2 + 3z^2}$$

is a rational function, we have, for example,

$$\lim_{(x,y,z) \to (2,1,3)} f(x, y, z) = \lim_{(x,y,z) \to (2,1,3)} \frac{x^2 y + 3xyz^2}{4x^2 + 3z^2} = \frac{4 + 54}{16 + 27} = \frac{58}{43}.$$

Example Combining (3.1.18) with (3.1.15), we have

$$\begin{aligned}
\lim_{(x,y,z) \to (1,2,1)} \log\left(\frac{1}{x^2 + y^2 + z^2}\right) &= \log\left(\lim_{(x,y,z) \to (1,2,1)} \frac{1}{x^2 + y^2 + z^2}\right) \\
&= \log\left(\frac{1}{6}\right) \\
&= -\log(6).
\end{aligned}$$

From the continuity of the square root function and our result above about the continuity
of polynomials, we may conclude that the function $f : \mathbb{R}^n \to \mathbb{R}$ defined by

$$f(x_1, x_2, \ldots, x_n) = \|(x_1, x_2, \ldots, x_n)\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$

is a continuous function. This fact is useful in computing some limits, particularly in
combination with the fact that for any point $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ in $\mathbb{R}^n$,

$$\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} \ge \sqrt{x_k^2} = |x_k| \tag{3.1.19}$$

for any $k = 1, 2, \ldots, n$.

Example Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$f(x, y) = \frac{x^2 y}{x^2 + y^2}.$$

Although $f$ is a rational function, we cannot use (3.1.18) to compute

$$\lim_{(x,y) \to (0,0)} f(x, y)$$

since $f$ is not defined at $(0, 0)$. However, if we let $\mathbf{x} = (x, y)$, then, using (3.1.19),

$$|f(x, y)| = \left|\frac{x^2 y}{x^2 + y^2}\right| = \frac{|x|^2 |y|}{\|\mathbf{x}\|^2} \le \frac{\|\mathbf{x}\|^2 \|\mathbf{x}\|}{\|\mathbf{x}\|^2} = \|\mathbf{x}\|.$$

Now

$$\lim_{(x,y) \to (0,0)} \|\mathbf{x}\| = 0,$$

so

$$\lim_{(x,y) \to (0,0)} |f(x, y)| = 0.$$

Hence

$$\lim_{(x,y) \to (0,0)} f(x, y) = \lim_{(x,y) \to (0,0)} \frac{x^2 y}{x^2 + y^2} = 0.$$

See Figure 3.1.6.

Recall that for a function $\varphi : \mathbb{R} \to \mathbb{R}$,

$$\lim_{t \to c} \varphi(t) = L$$

if and only if both

$$\lim_{t \to c^-} \varphi(t) = L$$

and

$$\lim_{t \to c^+} \varphi(t) = L.$$

In particular, if the one-sided limits do not agree, we may conclude that the limit does
not exist. Similar reasoning may be applied to a function $f : \mathbb{R}^n \to \mathbb{R}$, the difference
being that there are infinitely many different curves along which the variable $\mathbf{x}$ might
approach a given point $\mathbf{c}$ in $\mathbb{R}^n$, as opposed to only the two directions of approach in $\mathbb{R}$.
As a consequence, it is not possible to establish the existence of a limit with this type
of argument. Nevertheless, finding two ways to approach $\mathbf{c}$ which yield different limiting
values is sufficient to show that the limit does not exist.

Example Suppose $g : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$g(x, y) = \frac{xy}{x^2 + y^2}.$$

If we define $\alpha : \mathbb{R} \to \mathbb{R}^2$ by $\alpha(t) = (t, 0)$, then

$$\lim_{t \to 0} \alpha(t) = \lim_{t \to 0} (t, 0) = (0, 0)$$

and

$$\lim_{t \to 0} g(\alpha(t)) = \lim_{t \to 0} g(t, 0) = \lim_{t \to 0} \frac{0}{t^2} = 0.$$

Now $\alpha$ is a parametrization of the $x$-axis, so the previous limit computation says that
$g(x, y)$ approaches $0$ as $(x, y)$ approaches $(0, 0)$ along the $x$-axis. However, if we define
$\beta : \mathbb{R} \to \mathbb{R}^2$ by $\beta(t) = (t, t)$, then $\beta$ parametrizes the line $x = y$,

$$\lim_{t \to 0} \beta(t) = \lim_{t \to 0} (t, t) = (0, 0),$$

and

$$\lim_{t \to 0} g(\beta(t)) = \lim_{t \to 0} g(t, t) = \lim_{t \to 0} \frac{t^2}{2t^2} = \frac{1}{2}.$$

Hence $g(x, y)$ approaches $\frac{1}{2}$ as $(x, y)$ approaches $(0, 0)$ along the line $x = y$. Since these two
limits are different, we may conclude that $g(x, y)$ does not have a limit as $(x, y)$ approaches
$(0, 0)$. Note that $g$ in this example and $f$ in the previous example are very similar functions,
although our limit calculations show that their behavior around $(0, 0)$ differs significantly.
In particular, $f$ has a limit as $(x, y)$ approaches $(0, 0)$, whereas $g$ does not. This may be
seen by comparing the graph of $g$ in Figure 3.1.7, which has a tear at the origin, with that
of $f$ in Figure 3.1.6.

Figure 3.1.7 Graph of $g(x, y) = \dfrac{xy}{x^2 + y^2}$
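
The contrast between the two functions near the origin can be seen numerically. In the sketch below, `f` is x²y/(x² + y²) and `g` is xy/(x² + y²); the sample points, the extra direction (t, -2t), and the variable names are illustrative assumptions. Sampling a few paths can expose a missing limit, as it does for `g`, but it cannot by itself establish that a limit exists; for `f` that comes from the bound |f(x, y)| ≤ ‖(x, y)‖.

```python
import math

def f(x, y):
    """x^2 y / (x^2 + y^2), which has limit 0 at the origin."""
    return x * x * y / (x * x + y * y)

def g(x, y):
    """xy / (x^2 + y^2), which has no limit at the origin."""
    return x * y / (x * x + y * y)

ts = [10.0 ** (-k) for k in range(1, 8)]

# g along the x-axis, alpha(t) = (t, 0), and along the line x = y, beta(t) = (t, t).
along_x_axis = [g(t, 0.0) for t in ts]
along_diagonal = [g(t, t) for t in ts]
print(along_x_axis)     # all 0.0
print(along_diagonal)   # all 0.5, so the two paths disagree

# The squeeze bound |f(x, y)| <= ||(x, y)|| at sample points in several directions.
bound_holds = all(abs(f(x, y)) <= math.hypot(x, y)
                  for t in ts
                  for x, y in [(t, t), (t, -2.0 * t), (0.0, t), (t, 0.0)])
print(bound_holds)
```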

The next proposition lists some basic properties of continuous functions, all of which 
follow immediately from the similar list of properties of limits. 

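Proposition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}$ are both continuous at c. Then the
functions with values at x given by

$$f(\mathbf{x}) + g(\mathbf{x}), \tag{3.1.20}$$

$$f(\mathbf{x}) - g(\mathbf{x}), \tag{3.1.21}$$

$$f(\mathbf{x})\,g(\mathbf{x}), \tag{3.1.22}$$

$$\frac{f(\mathbf{x})}{g(\mathbf{x})} \tag{3.1.23}$$

(provided $g(\mathbf{c}) \neq 0$), and

$$k f(\mathbf{x}), \tag{3.1.24}$$

where k is any scalar, are all continuous at c.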

From the result above about the limit of a composition of two functions, we have the 
following proposition. 

Proposition If $f : \mathbb{R}^n \to \mathbb{R}$ is continuous at c and $\varphi : \mathbb{R} \to \mathbb{R}$ is continuous at $f(\mathbf{c})$,
then $\varphi \circ f$ is continuous at c.
Example Since the function $\varphi(t) = \sin(t)$ is continuous for all t and the function

$$f(x, y, z) = \sqrt{x^2 + y^2 + z^2}$$

is continuous at all points (x, y, z) in $\mathbb{R}^3$, the function

$$g(x, y, z) = \sin\left(\sqrt{x^2 + y^2 + z^2}\right)$$

is continuous at all points (x, y, z) in $\mathbb{R}^3$.
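Example Since the function

$$h(x, y) = \sin\left(\sqrt{x^2 + y^2}\right)$$

is continuous for all (x, y) in $\mathbb{R}^2$ (same argument as in the previous example) and the
function

$$g(x, y) = \sqrt{x^2 + y^2}$$

is continuous for all (x, y) in $\mathbb{R}^2$, the function

$$f(x, y) = \frac{\sin\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}}$$

is, using (3.1.23), continuous at every point $(x, y) \neq (0, 0)$ in $\mathbb{R}^2$. Moreover, if we let
$\mathbf{x} = (x, y)$, then

$$\lim_{(x,y) \to (0,0)} f(x, y) = \lim_{(x,y) \to (0,0)} \frac{\sin\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}} = \lim_{(x,y) \to (0,0)} \frac{\sin(\|\mathbf{x}\|)}{\|\mathbf{x}\|} = \lim_{r \to 0} \frac{\sin(r)}{r} = 1.$$

Hence the discontinuity at (0, 0) is removable. That is, if we define

$$f(x, y) = \begin{cases} \dfrac{\sin\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}}, & \text{if } (x, y) \neq (0, 0), \\ 1, & \text{if } (x, y) = (0, 0), \end{cases}$$

then f is continuous for all (x, y) in $\mathbb{R}^2$.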



Open and closed sets 

In single- variable calculus we talk about a function being continuous not just at a point, 
but on an open interval, meaning that the function is continuous at every point in the 
open interval. Similarly, we need to generalize the definition of continuity of a function 
$f : \mathbb{R}^n \to \mathbb{R}$ from that of continuity at a point in $\mathbb{R}^n$ to the idea of a function being
continuous on a set in $\mathbb{R}^n$. Now the condition for a function f to be continuous at a point
c requires that f be defined on some open ball containing c. Hence, in order to say that f
is continuous at every point in some set U, it is necessary that, given any point u in U, f
be defined on some open ball containing u. This provides the motivation for the following
definition.

Definition We say a set of points U in $\mathbb{R}^n$ is open if whenever u is a point in U, there
exists a real number r > 0 such that the open ball $B^n(\mathbf{u}, r)$ lies entirely within U. We say
a set of points C in $\mathbb{R}^n$ is closed if the set of all points in $\mathbb{R}^n$ which do not lie in C forms
an open set.

Example $\mathbb{R}^n$ is itself an open set.

Example Any open ball in $\mathbb{R}^n$ is an open set. In particular, any open interval in $\mathbb{R}$ is an
open set. To see why, consider an open ball $B^n(\mathbf{a}, r)$ in $\mathbb{R}^n$. Given a point y in $B^n(\mathbf{a}, r)$,
let s be the smaller of $\|\mathbf{y} - \mathbf{a}\|$ (the distance from y to the center of the ball) and $r - \|\mathbf{y} - \mathbf{a}\|$
(the distance from y to the edge of the ball); if y = a, take s = r instead. Then $B^n(\mathbf{y}, s)$ is an
open ball which lies entirely within $B^n(\mathbf{a}, r)$. Hence $B^n(\mathbf{a}, r)$ is an open set.

Example Any closed ball in $\mathbb{R}^n$ is a closed set. In particular, any closed interval in $\mathbb{R}$ is
a closed set. To see why, consider a closed ball $\bar{B}^n(\mathbf{a}, r)$. Given a point y not in $\bar{B}^n(\mathbf{a}, r)$,
let $s = \|\mathbf{y} - \mathbf{a}\| - r$, the distance from y to the edge of $\bar{B}^n(\mathbf{a}, r)$. Then $B^n(\mathbf{y}, s)$ is an open
ball which lies entirely outside of $\bar{B}^n(\mathbf{a}, r)$. Hence $\bar{B}^n(\mathbf{a}, r)$ is a closed set.

Example Given real numbers $a_1 < b_1, a_2 < b_2, \ldots, a_n < b_n$, we call the set

$$U = \{(x_1, x_2, \ldots, x_n) : a_i < x_i < b_i,\ i = 1, 2, \ldots, n\}$$

an open rectangle in $\mathbb{R}^n$ and the set

$$C = \{(x_1, x_2, \ldots, x_n) : a_i \leq x_i \leq b_i,\ i = 1, 2, \ldots, n\}$$

a closed rectangle in $\mathbb{R}^n$. An argument similar to that in the previous example shows that
U is an open set and C is a closed set.

Definition We say a function $f : \mathbb{R}^n \to \mathbb{R}$ is continuous on an open set U if f is
continuous at every point u in U.

Example The function

$$f(x, y, z) = \frac{3xyz - 6x}{x^2 + y^2 + z^2 + 1}$$

is continuous on $\mathbb{R}^3$.



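Example The functions

$$f(x, y) = \begin{cases} \dfrac{x^2 y}{x^2 + y^2}, & \text{if } (x, y) \neq (0, 0), \\ 0, & \text{if } (x, y) = (0, 0), \end{cases}$$

and

$$g(x, y) = \begin{cases} \dfrac{\sin\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}}, & \text{if } (x, y) \neq (0, 0), \\ 1, & \text{if } (x, y) = (0, 0), \end{cases}$$

are, from our work in previous examples, continuous on $\mathbb{R}^2$.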
Example The function

$$g(x, y) = \frac{xy}{x^2 + y^2}$$

is continuous on the open set

$$U = \{(x, y) : (x, y) \neq (0, 0)\}.$$

Note that in this case it is not possible to define g at (0, 0) in such a way that the resulting 
function is continuous at (0, 0), a consequence of our work above showing that g does not 
have a limit as (x,y) approaches (0,0). 

Example The function 

f(x,y) = log(xy) 

is continuous on the open set 

U = {(x, y) : x > 0 and y > 0}.

Problems 

1. Plot the graph and a contour plot for each of the following functions. Do your plots 
over regions large enough to illustrate the behavior of the function. 



(a) $f(x, y) = x^2 + 4y^2$
(b) $f(x, y) = x^2 - y^2$
(c) $f(x, y) = 4y^2 - 2x^2$
(d) $h(x, y) = \sin(x)\cos(y)$
(e) $f(x, y) = \sin(x + y)$
(f) $g(x, y) = \sin(x^2 + y^2)$
(g) $g(x, y) = \sin(x^2 - y^2)$
(h) $h(x, y) = x e^{-\sqrt{x^2 + y^2}}$
(i) f(x, y) =
(j) $f(x, y) = \sin(\pi \sin(x) + y)$
(k) $h(x, y) = \dfrac{\sin(x^2 + y^2)}{x^2 + y^2}$
(l) $g(x, y) = \log\left(\sqrt{x^2 + y^2}\right)$



2. For each of the following, plot the contour surface f(x, y,z) = c for the specified value 
of c. 

(a) $f(x, y, z) = x^2 + y^2 + z^2$, c = 4
(b) $f(x, y, z) = x^2 + 4y^2 + 2z^2$, c = 7
(c) $f(x, y, z) = x^2 + y^2 - z^2$, c = 1
(d) $f(x, y, z) = x^2 - y^2 + z^2$, c = 1

3. Evaluate the following limits, 
(a) lim {"Sxy + x 2 y + Ay) (b) lim 



(x,y)^(2,l) ' ' 1 • (x,y,z)^(l,2,l) 2xy 2 + Az 

lim c osily) Um 2 X -Sy +Z 

(x,y)^(2,0) Vx 2 + 1 (x,y, 2 )^(2,l,3) 
4. For each of the following, either find the specified limit or explain why the limit does
not exist.

xy 2 . x 



(a) lim -s- — ^ (b) lim 

(x, y )^(o,o) x 2 + y 2 (x, y )-Ko,o) a; + y 

(c) lim - (d) lim 



(x,y)^(0,0) X + y 2 v ' (x,y)^(0,0) ^2 + y 2 

l_ e -0 2 +y 2 ) x 4 -y 4 

(e) lim - - — (f) lim — - 

(*,y)-^(0,0) X 2 + (as,»)->(0,0) X 2 + y 2 

5. Let $f(x, y) = \dfrac{x^2 y}{x^4 + 4y^2}$.

(a) Define $\alpha : \mathbb{R} \to \mathbb{R}^2$ by $\alpha(t) = (t, 0)$. Show that $\lim_{t \to 0} f(\alpha(t)) = 0$.

(b) Define $\beta : \mathbb{R} \to \mathbb{R}^2$ by $\beta(t) = (0, t)$. Show that $\lim_{t \to 0} f(\beta(t)) = 0$.

(c) Show that for any real number m, if we define $\gamma : \mathbb{R} \to \mathbb{R}^2$ by $\gamma(t) = (t, mt)$, then
$\lim_{t \to 0} f(\gamma(t)) = 0$.

(d) Define $\delta : \mathbb{R} \to \mathbb{R}^2$ by $\delta(t) = (t, t^2)$. Show that $\lim_{t \to 0} f(\delta(t)) = \frac{1}{5}$.

(e) What can you conclude about $\lim_{(x,y) \to (0,0)} \dfrac{x^2 y}{x^4 + 4y^2}$?

(f) Plot the graph of f and explain your results in terms of the graph.
6. Discuss the continuity of the function 

1. if (x,y) = (0,0). 



7. Discuss the continuity of the function 



''' V , if Or, y)^ (0,0), 



g(x,y)={ x 4 + V 4 

1, if (x,y) = (0,0). 

8. For each of the following, decide whether the given set is open, closed, neither open 
nor closed, or both open and closed. 

(a) (3, 10) in R 

(b) [-2, 5] in R 

(c) {(x,y) :x 2 + y 2 <4} in R 2 

(d) {(x,y) :x 2 + y 2 >4} in R 2 
(e) {(x,y) :x 2 + y 2 <4} in R 2

(f) {(x,y) :x 2 + y 2 =4} in R 2

(g) {(x, y, z) : -1 < x < 1, -2 < y < 3, 2 < z < 5} in R 3

(h) {(x, y) : -3 < x < 4, -2 < y < 1} in R 2

9. Give an example of a subset of R which is neither open nor closed. 
10. Is it possible for a subset of R 2 to be both open and closed? Explain. 
Directional Derivatives and
the Gradient



For a function $\varphi : \mathbb{R} \to \mathbb{R}$, the derivative at a point c, that is,

$$\varphi'(c) = \lim_{h \to 0} \frac{\varphi(c + h) - \varphi(c)}{h}, \tag{3.2.1}$$

is the slope of the best affine approximation to $\varphi$ at c. We may also regard it as the slope
of the graph of $\varphi$ at $(c, \varphi(c))$, or as the instantaneous rate of change of $\varphi(x)$ with respect
to x when x = c. As a prelude to finding the best affine approximations for a function
$f : \mathbb{R}^n \to \mathbb{R}$, we will first discuss how to generalize (3.2.1) to this setting using the ideas of
slopes and rates of change for our motivation.

Directional derivatives 

Example Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by

$$f(x, y) = 4 - 2x^2 - y^2,$$

the graph of which is pictured in Figure 3.2.1. If we imagine a bug moving along this 
surface, then the slope of the path encountered by the bug will depend both on the bug's 
position and the direction in which it is moving. For example, if the bug is above the point 
(1,1) in the xy-plane, moving in the direction of the vector v = (—1,-1) will cause it to 
head directly towards the top of the graph, and thus have a steep rate of ascent, whereas 
moving in the direction of — v = (1, 1) would cause it to descend at a fast rate. These two 
possibilities are illustrated by the red curve on the surface in Figure 3.2.1. For another 
example, heading around the surface above the ellipse 

2x 2 + y 2 = 3 

in the xy-plane, which from (1, 1) means heading initially in the direction of the vector 
w = (—1, 2), would lead the bug around the side of the hill with no change in elevation, 
and hence a slope of 0. This possibility is illustrated by the green curve on the surface in 
Figure 3.2.1. Thus in order to talk about the slope of the graph of f at a point, we must
specify a direction as well. For example, suppose the bug moves in the direction of v. If
we let

$$\mathbf{u} = -\frac{1}{\sqrt{2}}(1, 1),$$

the direction of v, then, letting c = (1, 1),

$$\frac{f(\mathbf{c} + h\mathbf{u}) - f(\mathbf{c})}{h}$$

would, for any h > 0, represent an approximation to the slope of the graph of f at (1, 1)
in the direction of u. As in single-variable calculus, we should expect that taking the limit
as h approaches 0 should give us the exact slope at (1, 1) in the direction of u. Now

$$\frac{f(\mathbf{c} + h\mathbf{u}) - f(\mathbf{c})}{h} = \frac{f\left(1 - \frac{h}{\sqrt{2}}, 1 - \frac{h}{\sqrt{2}}\right) - 1}{h} = \frac{3 - 3\left(1 - \sqrt{2}\,h + \frac{h^2}{2}\right)}{h} = 3\sqrt{2} - \frac{3h}{2},$$

so

$$\lim_{h \to 0} \frac{f(\mathbf{c} + h\mathbf{u}) - f(\mathbf{c})}{h} = \lim_{h \to 0} \left(3\sqrt{2} - \frac{3h}{2}\right) = 3\sqrt{2}.$$

Figure 3.2.1 Graph of $f(x, y) = 4 - 2x^2 - y^2$

Hence the graph of f has a slope of $3\sqrt{2}$ if we start above (1, 1) and head in the direction
of u; similar computations would show that the slope in the direction of $-\mathbf{u}$ is $-3\sqrt{2}$ and
the slope in the direction of

$$\frac{\mathbf{w}}{\|\mathbf{w}\|} = \frac{1}{\sqrt{5}}(-1, 2)$$

is 0.

Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on an open ball about a point c. Given a
unit vector u, we call

$$D_{\mathbf{u}} f(\mathbf{c}) = \lim_{h \to 0} \frac{f(\mathbf{c} + h\mathbf{u}) - f(\mathbf{c})}{h}, \tag{3.2.2}$$

provided the limit exists, the directional derivative of f in the direction of u at c.

Example From our work above, if $f(x, y) = 4 - 2x^2 - y^2$ and

$$\mathbf{u} = -\frac{1}{\sqrt{2}}(1, 1),$$

then $D_{\mathbf{u}} f(1, 1) = 3\sqrt{2}$.
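
Definition (3.2.2) lends itself to a quick numerical check. The sketch below is only an illustration, assuming Python floats; the helper name difference_quotient is hypothetical and not part of the text. It evaluates the quotient in (3.2.2) for shrinking h at c = (1, 1) in the direction of u.

```python
import math

def f(x, y):
    return 4 - 2 * x**2 - y**2

def difference_quotient(func, c, u, h):
    """(func(c + h u) - func(c)) / h for a point c and direction u in the plane."""
    return (func(c[0] + h * u[0], c[1] + h * u[1]) - func(c[0], c[1])) / h

c = (1.0, 1.0)
u = (-1 / math.sqrt(2), -1 / math.sqrt(2))
for h in (1e-1, 1e-2, 1e-4):
    print(h, difference_quotient(f, c, u, h))
print("3*sqrt(2) =", 3 * math.sqrt(2))
```

The printed quotients should move toward $3\sqrt{2}$ as h shrinks, falling short by roughly 3h/2, which matches the algebra in the first example of this section.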

Directional derivatives in the direction of the standard basis vectors will be of special 
importance. 

Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on an open ball about a point c. If we
consider f as a function of $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and let $\mathbf{e}_k$ be the kth standard basis
vector, k = 1, 2, . . . , n, then we call $D_{\mathbf{e}_k} f(\mathbf{c})$, if it exists, the partial derivative of f with
respect to $x_k$ at c.

Notations for the partial derivative of f with respect to $x_k$ at an arbitrary point
$\mathbf{x} = (x_1, x_2, \ldots, x_n)$ include $D_{x_k} f(x_1, x_2, \ldots, x_n)$, $f_{x_k}(x_1, x_2, \ldots, x_n)$, and

$$\frac{\partial}{\partial x_k} f(x_1, x_2, \ldots, x_n).$$

Now suppose $f : \mathbb{R}^n \to \mathbb{R}$ and, for fixed $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, define $g : \mathbb{R} \to \mathbb{R}$ by

$$g(t) = f(t, x_2, \ldots, x_n).$$

Then

$$\begin{aligned}
f_{x_1}(x_1, x_2, \ldots, x_n) &= \lim_{h \to 0} \frac{f((x_1, x_2, \ldots, x_n) + h\mathbf{e}_1) - f(x_1, x_2, \ldots, x_n)}{h} \\
&= \lim_{h \to 0} \frac{f((x_1, x_2, \ldots, x_n) + (h, 0, \ldots, 0)) - f(x_1, x_2, \ldots, x_n)}{h} \\
&= \lim_{h \to 0} \frac{f(x_1 + h, x_2, \ldots, x_n) - f(x_1, x_2, \ldots, x_n)}{h} \\
&= \lim_{h \to 0} \frac{g(x_1 + h) - g(x_1)}{h} \\
&= g'(x_1).
\end{aligned} \tag{3.2.3}$$

In other words, we may compute the partial derivative $f_{x_1}(x_1, x_2, \ldots, x_n)$ by treating
$x_2, x_3, \ldots, x_n$ as constants and differentiating with respect to $x_1$ as we would in single-variable
calculus. The same statement holds for any coordinate: To find the partial
derivative with respect to $x_k$, treat the other coordinates as constants and differentiate
as if the function depended only on $x_k$.

Example If $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$f(x, y) = 3x^2 - 4xy^2,$$

then, treating y as a constant and differentiating with respect to x,

$$f_x(x, y) = 6x - 4y^2$$

and, treating x as a constant and differentiating with respect to y,

$$f_y(x, y) = -8xy.$$

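Example If $f : \mathbb{R}^4 \to \mathbb{R}$ is defined by

$$f(w, x, y, z) = -\log(w^2 + x^2 + y^2 + z^2),$$

then

$$\frac{\partial}{\partial w} f(w, x, y, z) = -\frac{2w}{w^2 + x^2 + y^2 + z^2},$$

$$\frac{\partial}{\partial x} f(w, x, y, z) = -\frac{2x}{w^2 + x^2 + y^2 + z^2},$$

$$\frac{\partial}{\partial y} f(w, x, y, z) = -\frac{2y}{w^2 + x^2 + y^2 + z^2},$$

and

$$\frac{\partial}{\partial z} f(w, x, y, z) = -\frac{2z}{w^2 + x^2 + y^2 + z^2}.$$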



Example Suppose $g : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$g(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & \text{if } (x, y) \neq (0, 0), \\ 0, & \text{if } (x, y) = (0, 0). \end{cases}$$

We saw in Section 3.1 that $\lim_{(x,y) \to (0,0)} g(x, y)$ does not exist; in particular, g is not continuous
at (0, 0). However,

$$\frac{\partial}{\partial x} g(0, 0) = \lim_{h \to 0} \frac{g((0, 0) + h(1, 0)) - g(0, 0)}{h} = \lim_{h \to 0} \frac{g(h, 0)}{h} = \lim_{h \to 0} \frac{0}{h} = 0$$

and

$$\frac{\partial}{\partial y} g(0, 0) = \lim_{h \to 0} \frac{g((0, 0) + h(0, 1)) - g(0, 0)}{h} = \lim_{h \to 0} \frac{g(0, h)}{h} = \lim_{h \to 0} \frac{0}{h} = 0.$$

This shows that it is possible for a function to have partial derivatives at a point without
being continuous at that point. However, we shall see in Section 3.3 that this function is
not differentiable at (0, 0); that is, g does not have a best affine approximation at (0, 0).
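
A numerical sketch can make this behavior concrete. The code below is illustrative only; the helper quotient_at_origin and the diagonal direction diag are assumptions introduced for the illustration. It evaluates the difference quotient of g at the origin along the two coordinate directions and along the diagonal direction $\frac{1}{\sqrt{2}}(1, 1)$.

```python
import math

def g(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x**2 + y**2)

def quotient_at_origin(u, h):
    """(g(h u) - g(0, 0)) / h, the difference quotient of g at the origin in direction u."""
    return (g(h * u[0], h * u[1]) - g(0.0, 0.0)) / h

e1, e2 = (1.0, 0.0), (0.0, 1.0)
diag = (1 / math.sqrt(2), 1 / math.sqrt(2))
for h in (1e-1, 1e-3, 1e-5):
    print(h, quotient_at_origin(e1, h), quotient_at_origin(e2, h),
          quotient_at_origin(diag, h))
```

Along the coordinate directions the quotient is identically 0, in agreement with the two partial derivatives computed above, while along the diagonal it behaves like 1/(2h) and grows without bound as h shrinks.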

The gradient 

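Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on an open ball containing the point c and
$\frac{\partial}{\partial x_k} f(\mathbf{c})$ exists for k = 1, 2, . . . , n. We call the vector

$$\nabla f(\mathbf{c}) = \left(\frac{\partial}{\partial x_1} f(\mathbf{c}), \frac{\partial}{\partial x_2} f(\mathbf{c}), \ldots, \frac{\partial}{\partial x_n} f(\mathbf{c})\right) \tag{3.2.4}$$

the gradient of f at c.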

Example If $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$f(x, y) = 3x^2 - 4xy^2,$$

then

$$\nabla f(x, y) = (6x - 4y^2, -8xy).$$

Thus, for example, $\nabla f(2, -1) = (8, 16)$.
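
As a sanity check on a hand-computed gradient, one can compare it with finite differences. The following is a minimal sketch, assuming a central-difference step of h = 1e-6; the helper numerical_gradient is a hypothetical name chosen for this illustration.

```python
def f(x, y):
    return 3 * x**2 - 4 * x * y**2

def grad_f(x, y):
    """Gradient computed by hand in the text: (6x - 4y^2, -8xy)."""
    return (6 * x - 4 * y**2, -8 * x * y)

def numerical_gradient(func, x, y, h=1e-6):
    """Central-difference estimates of the two partial derivatives."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (dfdx, dfdy)

print(grad_f(2.0, -1.0))
print(numerical_gradient(f, 2.0, -1.0))
```

Each finite difference varies one coordinate while holding the other fixed, which is exactly the "treat the other coordinates as constants" rule for partial derivatives.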
Example If $f : \mathbb{R}^4 \to \mathbb{R}$ is defined by

$$f(w, x, y, z) = -\log(w^2 + x^2 + y^2 + z^2),$$

then

$$\nabla f(w, x, y, z) = -\frac{2}{w^2 + x^2 + y^2 + z^2}(w, x, y, z).$$

Thus, for example,

$$\nabla f(1, 2, 2, 1) = -\frac{1}{5}(1, 2, 2, 1).$$

Notice that if $f : \mathbb{R}^n \to \mathbb{R}$, then $\nabla f : \mathbb{R}^n \to \mathbb{R}^n$; that is, we may view the gradient as a
function which takes an n-dimensional vector for input and returns another n-dimensional
vector. We call a function of this type a vector field.



Definition We say a function $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on an open set U if f is continuous on
U and, for k = 1, 2, . . . , n, $\frac{\partial f}{\partial x_k}$ is continuous on U.
Now suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is $C^1$ on some open ball containing the point $\mathbf{c} = (c_1, c_2)$. Let
$\mathbf{u} = (u_1, u_2)$ be a unit vector and suppose we wish to compute the directional derivative
$D_{\mathbf{u}} f(\mathbf{c})$. From the definition, we have

$$\begin{aligned}
D_{\mathbf{u}} f(\mathbf{c}) &= \lim_{h \to 0} \frac{f(\mathbf{c} + h\mathbf{u}) - f(\mathbf{c})}{h} \\
&= \lim_{h \to 0} \frac{f(c_1 + hu_1, c_2 + hu_2) - f(c_1, c_2)}{h} \\
&= \lim_{h \to 0} \frac{f(c_1 + hu_1, c_2 + hu_2) - f(c_1 + hu_1, c_2) + f(c_1 + hu_1, c_2) - f(c_1, c_2)}{h} \\
&= \lim_{h \to 0} \left(\frac{f(c_1 + hu_1, c_2 + hu_2) - f(c_1 + hu_1, c_2)}{h} + \frac{f(c_1 + hu_1, c_2) - f(c_1, c_2)}{h}\right).
\end{aligned}$$

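For a fixed value of $h \neq 0$, define $\varphi : \mathbb{R} \to \mathbb{R}$ by

$$\varphi(t) = f(c_1 + hu_1, c_2 + t). \tag{3.2.5}$$

Note that $\varphi$ is differentiable with

$$\begin{aligned}
\varphi'(t) &= \lim_{s \to 0} \frac{\varphi(t + s) - \varphi(t)}{s} \\
&= \lim_{s \to 0} \frac{f(c_1 + hu_1, c_2 + t + s) - f(c_1 + hu_1, c_2 + t)}{s} \\
&= \frac{\partial}{\partial y} f(c_1 + hu_1, c_2 + t).
\end{aligned} \tag{3.2.6}$$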

Hence if we define $\alpha : \mathbb{R} \to \mathbb{R}$ by

$$\alpha(t) = \varphi(u_2 t) = f(c_1 + hu_1, c_2 + tu_2), \tag{3.2.7}$$

then $\alpha$ is differentiable with

$$\alpha'(t) = u_2 \varphi'(u_2 t) = u_2 \frac{\partial}{\partial y} f(c_1 + hu_1, c_2 + tu_2). \tag{3.2.8}$$

By the Mean Value Theorem from single-variable calculus, there exists a number a between
0 and h such that

$$\frac{\alpha(h) - \alpha(0)}{h} = \alpha'(a). \tag{3.2.9}$$

Putting (3.2.7) and (3.2.8) into (3.2.9), we have

$$\frac{f(c_1 + hu_1, c_2 + hu_2) - f(c_1 + hu_1, c_2)}{h} = u_2 \frac{\partial}{\partial y} f(c_1 + hu_1, c_2 + au_2). \tag{3.2.10}$$

Similarly, if we define $\beta : \mathbb{R} \to \mathbb{R}$ by

$$\beta(t) = f(c_1 + tu_1, c_2), \tag{3.2.11}$$

then $\beta$ is differentiable,

$$\beta'(t) = u_1 \frac{\partial}{\partial x} f(c_1 + tu_1, c_2), \tag{3.2.12}$$

and, using the Mean Value Theorem again, there exists a number b between 0 and h such
that

$$\frac{f(c_1 + hu_1, c_2) - f(c_1, c_2)}{h} = \frac{\beta(h) - \beta(0)}{h} = \beta'(b) = u_1 \frac{\partial}{\partial x} f(c_1 + bu_1, c_2). \tag{3.2.13}$$

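Putting (3.2.10) and (3.2.13) into our expression for $D_{\mathbf{u}} f(\mathbf{c})$ above, we have

$$D_{\mathbf{u}} f(\mathbf{c}) = \lim_{h \to 0} \left(u_2 \frac{\partial}{\partial y} f(c_1 + hu_1, c_2 + au_2) + u_1 \frac{\partial}{\partial x} f(c_1 + bu_1, c_2)\right). \tag{3.2.14}$$

Now both a and b approach 0 as h approaches 0 and both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are assumed to be
continuous, so evaluating the limit in (3.2.14) gives us

$$D_{\mathbf{u}} f(\mathbf{c}) = u_2 \frac{\partial}{\partial y} f(c_1, c_2) + u_1 \frac{\partial}{\partial x} f(c_1, c_2) = \nabla f(\mathbf{c}) \cdot \mathbf{u}. \tag{3.2.15}$$

A straightforward generalization of (3.2.15) to the case of a function $f : \mathbb{R}^n \to \mathbb{R}$ gives
us the following theorem.

Theorem Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on an open ball containing the point c. Then for
any unit vector u, $D_{\mathbf{u}} f(\mathbf{c})$ exists and

$$D_{\mathbf{u}} f(\mathbf{c}) = \nabla f(\mathbf{c}) \cdot \mathbf{u}. \tag{3.2.16}$$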
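Example If $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$f(x, y) = 4 - 2x^2 - y^2,$$

then

$$\nabla f(x, y) = (-4x, -2y).$$

If

$$\mathbf{u} = -\frac{1}{\sqrt{2}}(1, 1),$$

then

$$D_{\mathbf{u}} f(1, 1) = \nabla f(1, 1) \cdot \mathbf{u} = (-4, -2) \cdot \left(-\frac{1}{\sqrt{2}}(1, 1)\right) = \frac{6}{\sqrt{2}} = 3\sqrt{2},$$

as we saw in the first example of this section. Note also that

$$D_{-\mathbf{u}} f(1, 1) = \nabla f(1, 1) \cdot (-\mathbf{u}) = (-4, -2) \cdot \left(\frac{1}{\sqrt{2}}(1, 1)\right) = -\frac{6}{\sqrt{2}} = -3\sqrt{2}$$

and, if

$$\mathbf{w} = \frac{1}{\sqrt{5}}(-1, 2),$$

then

$$D_{\mathbf{w}} f(1, 1) = \nabla f(1, 1) \cdot \mathbf{w} = (-4, -2) \cdot \left(\frac{1}{\sqrt{5}}(-1, 2)\right) = 0,$$

as claimed earlier.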
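
Formula (3.2.16) reduces each of these directional derivatives to a single dot product. The sketch below is only an illustration of that arithmetic, with a hypothetical helper dot; it reproduces the three values for the gradient (-4, -2) of f at (1, 1).

```python
import math

def dot(a, b):
    """Dot product of two vectors given as tuples of equal length."""
    return sum(ai * bi for ai, bi in zip(a, b))

grad = (-4.0, -2.0)  # gradient of f(x, y) = 4 - 2x^2 - y^2 at (1, 1)
u = (-1 / math.sqrt(2), -1 / math.sqrt(2))
minus_u = (1 / math.sqrt(2), 1 / math.sqrt(2))
w = (-1 / math.sqrt(5), 2 / math.sqrt(5))

for name, direction in (("u", u), ("-u", minus_u), ("w", w)):
    print(name, dot(grad, direction))
```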

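Example Suppose the temperature at a point in a metal cube is given by

$$T(x, y, z) = 80 - 20x e^{-\frac{1}{20}(x^2 + y^2 + z^2)},$$

where the center of the cube is taken to be at (0, 0, 0). Then we have

$$\frac{\partial}{\partial x} T(x, y, z) = 2x^2 e^{-\frac{1}{20}(x^2 + y^2 + z^2)} - 20 e^{-\frac{1}{20}(x^2 + y^2 + z^2)},$$

$$\frac{\partial}{\partial y} T(x, y, z) = 2xy e^{-\frac{1}{20}(x^2 + y^2 + z^2)},$$

and

$$\frac{\partial}{\partial z} T(x, y, z) = 2xz e^{-\frac{1}{20}(x^2 + y^2 + z^2)},$$

so

$$\nabla T(x, y, z) = e^{-\frac{1}{20}(x^2 + y^2 + z^2)}(2x^2 - 20, 2xy, 2xz).$$

Hence, for example, the rate of change of temperature at the origin in the direction of the
unit vector

$$\mathbf{u} = \frac{1}{\sqrt{3}}(1, -1, 1)$$

is

$$D_{\mathbf{u}} T(0, 0, 0) = \nabla T(0, 0, 0) \cdot \mathbf{u} = (-20, 0, 0) \cdot \left(\frac{1}{\sqrt{3}}(1, -1, 1)\right) = -\frac{20}{\sqrt{3}}.$$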
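
The same computation can be sketched in code. The function name grad_T below is hypothetical, and the temperature function is taken to be $T(x, y, z) = 80 - 20x e^{-(x^2 + y^2 + z^2)/20}$; the code evaluates the gradient at the origin and dots it with the unit vector $\frac{1}{\sqrt{3}}(1, -1, 1)$.

```python
import math

def grad_T(x, y, z):
    """Gradient of T(x, y, z) = 80 - 20 x exp(-(x^2 + y^2 + z^2) / 20)."""
    scale = math.exp(-(x**2 + y**2 + z**2) / 20)
    return (scale * (2 * x**2 - 20), scale * 2 * x * y, scale * 2 * x * z)

u = tuple(component / math.sqrt(3) for component in (1.0, -1.0, 1.0))
rate = sum(gi * ui for gi, ui in zip(grad_T(0.0, 0.0, 0.0), u))
print(rate, -20 / math.sqrt(3))
```

Both printed numbers should agree up to rounding, since only the first component of the gradient is nonzero at the origin.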

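An application of the Cauchy-Schwarz inequality to (3.2.16) shows us that

$$|D_{\mathbf{u}} f(\mathbf{c})| = |\nabla f(\mathbf{c}) \cdot \mathbf{u}| \leq \|\nabla f(\mathbf{c})\| \|\mathbf{u}\| = \|\nabla f(\mathbf{c})\|. \tag{3.2.17}$$

Thus the magnitude of the rate of change of f in any direction at a given point never
exceeds the length of the gradient vector at that point. Moreover, in our discussion of
the Cauchy-Schwarz inequality we saw that we have equality in (3.2.17) if and only if u is
parallel to $\nabla f(\mathbf{c})$. Indeed, supposing $\nabla f(\mathbf{c}) \neq \mathbf{0}$, when

$$\mathbf{u} = \frac{\nabla f(\mathbf{c})}{\|\nabla f(\mathbf{c})\|},$$

we have

$$D_{\mathbf{u}} f(\mathbf{c}) = \nabla f(\mathbf{c}) \cdot \mathbf{u} = \frac{\nabla f(\mathbf{c}) \cdot \nabla f(\mathbf{c})}{\|\nabla f(\mathbf{c})\|} = \frac{\|\nabla f(\mathbf{c})\|^2}{\|\nabla f(\mathbf{c})\|} = \|\nabla f(\mathbf{c})\| \tag{3.2.18}$$

and

$$D_{-\mathbf{u}} f(\mathbf{c}) = -\|\nabla f(\mathbf{c})\|. \tag{3.2.19}$$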
Hence we have the following result. 

Proposition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on an open ball containing the point c. Then
$D_{\mathbf{u}} f(\mathbf{c})$ has a maximum value of $\|\nabla f(\mathbf{c})\|$ when u is the direction of $\nabla f(\mathbf{c})$ and a minimum
value of $-\|\nabla f(\mathbf{c})\|$ when u is the direction of $-\nabla f(\mathbf{c})$.

In other words, the gradient vector points in the direction of the maximum rate of 
increase of the function and the negative of the gradient vector points in the direction of 
the maximum rate of decrease of the function. Moreover, the length of the gradient vector 
tells us the rate of increase in the direction of maximum increase and its negative tells us 
the rate of decrease in the direction of maximum decrease. 
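
One way to see this proposition at work is to sample many unit vectors and compare the largest directional derivative found with the length of the gradient. The sketch below assumes the gradient (-4, -2), which is that of $f(x, y) = 4 - 2x^2 - y^2$ at (1, 1), and a grid of 3600 angles; both choices are illustrative.

```python
import math

grad = (-4.0, -2.0)  # gradient of f(x, y) = 4 - 2x^2 - y^2 at (1, 1)
norm = math.hypot(grad[0], grad[1])

def rate(angle):
    """Directional derivative grad . u for u = (cos(angle), sin(angle))."""
    return grad[0] * math.cos(angle) + grad[1] * math.sin(angle)

angles = [2 * math.pi * k / 3600 for k in range(3600)]
best = max(angles, key=rate)
print(rate(best), norm)
print((math.cos(best), math.sin(best)), (grad[0] / norm, grad[1] / norm))
```

Up to the resolution of the angle grid, the best sampled rate matches the length of the gradient and the best sampled direction matches the normalized gradient.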

Example As we saw above, if $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$f(x, y) = 4 - 2x^2 - y^2,$$

then

$$\nabla f(x, y) = (-4x, -2y).$$

Thus $\nabla f(1, 1) = (-4, -2)$. Hence if a bug standing above (1, 1) on the graph of f wants
to head in the direction of most rapid ascent, it should move in the direction of the unit
vector

$$\mathbf{u} = \frac{\nabla f(1, 1)}{\|\nabla f(1, 1)\|} = -\frac{1}{\sqrt{5}}(2, 1).$$

If the bug wants to head in the direction of most rapid descent, it should move in the
direction of the unit vector

$$-\mathbf{u} = \frac{1}{\sqrt{5}}(2, 1).$$

Moreover,

$$D_{\mathbf{u}} f(1, 1) = \|\nabla f(1, 1)\| = \sqrt{20}$$

and

$$D_{-\mathbf{u}} f(1, 1) = -\|\nabla f(1, 1)\| = -\sqrt{20}.$$

Figure 3.2.2 shows scaled values of V/(x, y) plotted for a grid of points (x,y). The vec- 
tors are scaled so that they fit in the plot, without overlap, yet still show their relative 
magnitudes. This is another good geometric way to view the behavior of the function. 
Supposing our bug were placed on the side of the graph above (1,1) and that it headed up 
the hill in such a manner that it always chose the direction of steepest ascent, we can see 
that it would head more quickly toward the y-axis than toward the x-axis. More explicitly,
if C is the shadow of the path of the bug in the xy-plane, then the slope of C at any point
(x, y) would be

$$\frac{dy}{dx} = \frac{-2y}{-4x} = \frac{y}{2x}.$$

Hence

$$\frac{1}{y}\frac{dy}{dx} = \frac{1}{2x}.$$

If we integrate both sides of this equality, we have

$$\int \frac{1}{y}\frac{dy}{dx}\,dx = \int \frac{1}{2x}\,dx.$$

Thus

$$\log|y| = \frac{1}{2}\log|x| + c$$

for some constant c, from which we have

$$e^{\log|y|} = e^{\frac{1}{2}\log|x| + c}.$$

It follows that

$$y = k\sqrt{|x|},$$

where $k = \pm e^c$. Since y = 1 when x = 1, k = 1 and we see that C is the graph of $y = \sqrt{x}$.
Figure 3.2.2 shows C along with the plot of the gradient vectors of f, while Figure 3.2.3
shows the actual path of the bug on the graph of f.

Figure 3.2.2 Scaled gradient vectors for $f(x, y) = 4 - 2x^2 - y^2$
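
The curve C can also be traced numerically. The sketch below is an approximation under stated assumptions: it takes small Euler steps (step size 1e-3, 500 steps, both arbitrary choices) from (1, 1) along the gradient field (-4x, -2y) without normalizing it, which changes the speed but not the curve, and records how far the computed points stray from $y = \sqrt{x}$.

```python
import math

def grad_f(x, y):
    """Gradient of f(x, y) = 4 - 2x^2 - y^2."""
    return (-4 * x, -2 * y)

x, y = 1.0, 1.0
step = 1e-3
max_gap = 0.0
for _ in range(500):
    gx, gy = grad_f(x, y)
    x, y = x + step * gx, y + step * gy  # Euler step in the gradient direction
    max_gap = max(max_gap, abs(y - math.sqrt(x)))
print(x, y, max_gap)
```

The gap stays small and shrinks further with a smaller step, which is consistent with C being the graph of $y = \sqrt{x}$.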
Figure 3.2.3 Graph of $f(x, y) = 4 - 2x^2 - y^2$ with path of most rapid ascent from (1, 1, 1)

Example For a two-dimensional version of the temperature example discussed above,
consider a metal plate heated so that its temperature at (x, y) is given by

$$T(x, y) = 80 - 20x e^{-\frac{1}{20}(x^2 + y^2)}.$$

Then

$$\nabla T(x, y) = e^{-\frac{1}{20}(x^2 + y^2)}(2x^2 - 20, 2xy),$$

so, for example,

$$\nabla T(0, 0) = (-20, 0).$$

Thus at the origin the temperature is increasing most rapidly in the direction of $\mathbf{u} = (-1, 0)$
and decreasing most rapidly in the direction of (1, 0). Moreover,

$$D_{\mathbf{u}} T(0, 0) = \|\nabla T(0, 0)\| = 20$$

and

$$D_{-\mathbf{u}} T(0, 0) = -\|\nabla T(0, 0)\| = -20.$$

Note that

$$D_{-\mathbf{u}} T(0, 0) = \frac{\partial}{\partial x} T(0, 0)$$

and

$$D_{\mathbf{u}} T(0, 0) = -\frac{\partial}{\partial x} T(0, 0).$$
Figure 3.2.4 Scaled gradient vectors for $T(x, y) = 80 - 20x e^{-\frac{1}{20}(x^2 + y^2)}$

Figure 3.2.4 is a plot of scaled gradient vectors for this temperature function. From the
plot it is easy to see which direction a bug placed on this metal plate would have to choose
in order to warm up as rapidly as possible. It should also be clear that the temperature
has a relative maximum around (-3, 0) and a relative minimum around (3, 0); these points
are, in fact, exactly $(-\sqrt{10}, 0)$ and $(\sqrt{10}, 0)$, the points where $\nabla T(x, y) = (0, 0)$. We will
consider the problem of finding maximum and minimum values of functions of more than
one variable in Section 3.5.
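
The claim about the two critical points is easy to probe numerically. The sketch below is illustrative, assuming $T(x, y) = 80 - 20x e^{-(x^2 + y^2)/20}$ and the gradient formula from this example; it evaluates the gradient and the temperature at $(\pm\sqrt{10}, 0)$.

```python
import math

def T(x, y):
    return 80 - 20 * x * math.exp(-(x**2 + y**2) / 20)

def grad_T(x, y):
    """Gradient of T: exp(-(x^2 + y^2) / 20) * (2x^2 - 20, 2xy)."""
    scale = math.exp(-(x**2 + y**2) / 20)
    return (scale * (2 * x**2 - 20), scale * 2 * x * y)

r = math.sqrt(10)
for point in ((-r, 0.0), (r, 0.0)):
    print(point, grad_T(*point), T(*point))
print("T(0, 0) =", T(0.0, 0.0))
```

The gradient vanishes (up to rounding) at both points, and the temperature there is respectively above and below its value of 80 at the origin.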

Problems 

1. Suppose / : M 2 — > E is defined by 

f(x,y) = 3x 2 + 2y 2 . 

Let 

Find _D u /(3, 1) directly from the definition (3.2.2). 

2. For each of the following functions, find the partial derivatives with respect to each 
variable. 
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4x 2 

(a) f(x,y) = 2 2 (b) g(x,y) = 4xy 2 e"f 

x ~\~ y 

(c) f(x,y,z) = 3x 2 y 3 z 4 — 13x 2 y (d) /i(x, y, z) = Axze * 2 +v 2 +z 2 

(e) g(w, x, y, z) = sin(^/«; 2 + x 2 + 2y 2 + 32 2 ) 
3. Find the gradient of each of the following functions, 

(a) f(x, y, z) = ^Jx 2 + y 2 + z 2 (b) g(x,y,z) 



\Jx 2 + y 2 + z 2 
(c) /(«;, x, y, 2) = tan _1 (4u> + 3x + 5y + 2) 

4. Find D u f(c) for each of the following. 

(a) /(x, y) = 3x 2 + 5y 2 , u = -L(3, -2), c = (-2, 1) 

V 13 



(b) f(x,y) = x 2 - 2y 2 , u = -^=(-1,2), c = (-2,3) 

(c) f{x,y,z) = 1 u= -4(1.2,1), c = (-2,2,1) 

+ y^ + v 6 

5. For each of the following, find the directional derivative of / at the point c in the 
direction of the specified vector w. 

(a) f(x, y) = 3x 2 y, w = (2, 3), c = (-2, 1) 

(b) f(x, y, z) = log(x 2 + 2y 2 + z 2 ), w = (-1, 2, 3), c = (2, 1, 1) 

(c) fit, x, y, z) = tx 2 yz 2 , w = (1, -1, 2, 3), c = (2, 1, -1, 2) 

6. A metal plate is heated so that its temperature at a point (x, y) is 

T(x,y) = 50y 2 e-^ x2+y2) . 
A bug is placed at the point (2, 1). 

(a) The bug heads toward the point (1,-2). What is the rate of change of temperature 
in this direction? 

(b) In what direction should the bug head in order to warm up at the fastest rate? 
What is the rate of change of temperature in this direction? 

(c) In what direction should the bug head in order to cool off at the fastest rate? What 
is the rate of change of temperature in this direction? 

(d) Make a plot of the gradient vectors and discuss what it tells you about the tem- 
peratures on the plate. 

7. A heat-seeking bug is a bug that always moves in the direction of the greatest increase 
in heat. Discuss the behavior of a heat seeking bug placed on a metal plate heated so 
that the temperature at (x, y) is given by 

T(x,y) = 100 - 40xye-w( a;2 +f 2 ). 
8. Suppose $g : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$g(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & \text{if } (x, y) \neq (0, 0), \\ 0, & \text{if } (x, y) = (0, 0). \end{cases}$$

We saw above that both partial derivatives of g exist at (0, 0), although g is not
continuous at (0, 0).

(a) Show that neither $\frac{\partial g}{\partial x}$ nor $\frac{\partial g}{\partial y}$ is continuous at (0, 0).

(b) Let 

Show that D u g(0, 0) does not exist. In particular, D u g(0,0) ^ Vg(0,0) ■ u. 

9. Suppose the price of a certain commodity, call it commodity A, is x dollars per unit
and the price of another commodity, B, is y dollars per unit. Moreover, suppose that
$d_A(x, y)$ represents the number of units of A that will be sold at these prices and
$d_B(x, y)$ represents the number of units of B that will be sold at these prices. These
functions are known as the demand functions for A and B.

(a) Explain why it is reasonable to assume that

$$\frac{\partial}{\partial x} d_A(x, y) < 0$$

and

$$\frac{\partial}{\partial y} d_B(x, y) < 0$$

for all (x, y).

(b) Suppose the two commodities are competitive. For example, they might be two
different brands of the same product. In this case, what would be reasonable
assumptions for the signs of

$$\frac{\partial}{\partial y} d_A(x, y)$$

and

$$\frac{\partial}{\partial x} d_B(x, y)?$$

(c) Suppose the two commodities complement each other. For example, commodity
A might be a computer and commodity B a type of software. In this case, what
would be reasonable assumptions for the signs of

$$\frac{\partial}{\partial y} d_A(x, y)$$

and

$$\frac{\partial}{\partial x} d_B(x, y)?$$
10. Suppose $P(x_1, x_2, \ldots, x_n)$ represents the total production per week of a certain factory
as a function of $x_1$, the number of workers, and other variables, such as the size of
the supply inventory, the number of hours the assembly lines run per week, and so on.
Show that average productivity

$$\frac{P(x_1, x_2, \ldots, x_n)}{x_1}$$

increases as $x_1$ increases if and only if

$$\frac{\partial}{\partial x_1} P(x_1, x_2, \ldots, x_n) > \frac{P(x_1, x_2, \ldots, x_n)}{x_1}.$$

11. Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on an open ball about the point c.

(a) Given a unit vector u, what is the relationship between $D_{\mathbf{u}} f(\mathbf{c})$ and $D_{-\mathbf{u}} f(\mathbf{c})$?

(b) Is it possible that $D_{\mathbf{u}} f(\mathbf{c}) > 0$ for every unit vector u?



The Calculus of Functions 
Best Affine Approximations



Best affine approximations

Given a function $f : \mathbb{R}^n \to \mathbb{R}$ and a point c, we wish to find the affine function $A : \mathbb{R}^n \to \mathbb{R}$
which best approximates f for points close to c. As before, best will mean that the
remainder function,

$$R(\mathbf{h}) = f(\mathbf{c} + \mathbf{h}) - A(\mathbf{c} + \mathbf{h}), \tag{3.3.1}$$

approaches 0 at a sufficiently fast rate. In this context, since $R(\mathbf{h})$ is a scalar and h is a
vector, sufficiently fast will mean that

$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{R(\mathbf{h})}{\|\mathbf{h}\|} = 0. \tag{3.3.2}$$

Generalizing our previous notation, we will say that a function $R : \mathbb{R}^n \to \mathbb{R}$ satisfying
(3.3.2) is $o(\mathbf{h})$. Note that if n = 1 this extended definition of $o(\mathbf{h})$ is equivalent to the
definition given in Section 2.2.

Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on an open ball containing the point c. We
call an affine function $A : \mathbb{R}^n \to \mathbb{R}$ the best affine approximation to f at c if (1) $A(\mathbf{c}) = f(\mathbf{c})$
and (2) $R(\mathbf{h})$ is $o(\mathbf{h})$, where

$$R(\mathbf{h}) = f(\mathbf{c} + \mathbf{h}) - A(\mathbf{c} + \mathbf{h}). \tag{3.3.3}$$

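Suppose $f : \mathbb{R}^n \to \mathbb{R}$ and suppose $A : \mathbb{R}^n \to \mathbb{R}$ is the best affine approximation to f at
c. Since A is affine, there exists a linear function $L : \mathbb{R}^n \to \mathbb{R}$ and a scalar b such that

$$A(\mathbf{x}) = L(\mathbf{x}) + b \tag{3.3.4}$$

for all x in $\mathbb{R}^n$. Since $A(\mathbf{c}) = f(\mathbf{c})$, we have

$$f(\mathbf{c}) = L(\mathbf{c}) + b, \tag{3.3.5}$$

which implies that

$$b = f(\mathbf{c}) - L(\mathbf{c}). \tag{3.3.6}$$

Hence

$$A(\mathbf{x}) = L(\mathbf{x}) + f(\mathbf{c}) - L(\mathbf{c}) = L(\mathbf{x} - \mathbf{c}) + f(\mathbf{c}) \tag{3.3.7}$$

for all x in $\mathbb{R}^n$. Moreover, if we let

$$\mathbf{a} = (L(\mathbf{e}_1), L(\mathbf{e}_2), \ldots, L(\mathbf{e}_n)), \tag{3.3.8}$$

where $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ are, as usual, the standard basis vectors for $\mathbb{R}^n$, then, from our results
in Section 1.5,

$$L(\mathbf{x}) = \mathbf{a} \cdot \mathbf{x} \tag{3.3.9}$$

for all x in $\mathbb{R}^n$. Hence

$$A(\mathbf{x}) = \mathbf{a} \cdot (\mathbf{x} - \mathbf{c}) + f(\mathbf{c}), \tag{3.3.10}$$

for all x in $\mathbb{R}^n$, and we see that A is completely determined by the vector a.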

Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on an open ball containing the point c. If f
has a best affine approximation at c, then we say f is differentiable at c. Moreover, if the
best affine approximation to f at c is given by

$$A(\mathbf{x}) = \mathbf{a} \cdot (\mathbf{x} - \mathbf{c}) + f(\mathbf{c}), \tag{3.3.11}$$

then we call a the derivative of f at c and write $Df(\mathbf{c}) = \mathbf{a}$.

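Now suppose $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at c with best affine approximation A and
let $\mathbf{a} = (a_1, a_2, \ldots, a_n) = Df(\mathbf{c})$. Since

$$R(\mathbf{h}) = f(\mathbf{c} + \mathbf{h}) - A(\mathbf{c} + \mathbf{h}) = f(\mathbf{c} + \mathbf{h}) - \mathbf{a} \cdot \mathbf{h} - f(\mathbf{c}) \tag{3.3.12}$$

is $o(\mathbf{h})$, we must have

$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{R(\mathbf{h})}{\|\mathbf{h}\|} = 0. \tag{3.3.13}$$

In particular, for k = 1, 2, . . . , n, if we let $\mathbf{h} = t\mathbf{e}_k$, then h approaches 0 as t approaches 0,
so

$$0 = \lim_{t \to 0} \frac{R(t\mathbf{e}_k)}{\|t\mathbf{e}_k\|} = \lim_{t \to 0} \frac{f(\mathbf{c} + t\mathbf{e}_k) - t(\mathbf{a} \cdot \mathbf{e}_k) - f(\mathbf{c})}{|t|} = \lim_{t \to 0} \frac{f(\mathbf{c} + t\mathbf{e}_k) - ta_k - f(\mathbf{c})}{|t|}.$$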

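First considering t > 0, we have

$$0 = \lim_{t \to 0^+} \frac{f(\mathbf{c} + t\mathbf{e}_k) - ta_k - f(\mathbf{c})}{t} = \lim_{t \to 0^+} \left(\frac{f(\mathbf{c} + t\mathbf{e}_k) - f(\mathbf{c})}{t} - a_k\right), \tag{3.3.14}$$

implying that

$$a_k = \lim_{t \to 0^+} \frac{f(\mathbf{c} + t\mathbf{e}_k) - f(\mathbf{c})}{t}. \tag{3.3.15}$$

With t < 0, we have

$$0 = \lim_{t \to 0^-} \frac{f(\mathbf{c} + t\mathbf{e}_k) - ta_k - f(\mathbf{c})}{-t} = -\lim_{t \to 0^-} \left(\frac{f(\mathbf{c} + t\mathbf{e}_k) - f(\mathbf{c})}{t} - a_k\right), \tag{3.3.16}$$

implying that

$$a_k = \lim_{t \to 0^-} \frac{f(\mathbf{c} + t\mathbf{e}_k) - f(\mathbf{c})}{t}. \tag{3.3.17}$$

Hence

$$a_k = \lim_{t \to 0} \frac{f(\mathbf{c} + t\mathbf{e}_k) - f(\mathbf{c})}{t} = \frac{\partial}{\partial x_k} f(\mathbf{c}). \tag{3.3.18}$$

Thus we have shown that

$$\mathbf{a} = \left(\frac{\partial}{\partial x_1} f(\mathbf{c}), \frac{\partial}{\partial x_2} f(\mathbf{c}), \ldots, \frac{\partial}{\partial x_n} f(\mathbf{c})\right) = \nabla f(\mathbf{c}). \tag{3.3.19}$$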

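Theorem If $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at c, then

$$Df(\mathbf{c}) = \nabla f(\mathbf{c}). \tag{3.3.20}$$

It now follows that if $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at c, then the best affine approximation
to f at c is

$$A(\mathbf{x}) = \nabla f(\mathbf{c}) \cdot (\mathbf{x} - \mathbf{c}) + f(\mathbf{c}). \tag{3.3.21}$$

However, the converse does not hold: it is possible for $\nabla f(\mathbf{c})$ to exist even when f is not
differentiable at c. Before looking at an example, note that if f is differentiable at c and
A is the best affine approximation to f at c, then, since $R(\mathbf{h}) = f(\mathbf{c} + \mathbf{h}) - A(\mathbf{c} + \mathbf{h})$ is
$o(\mathbf{h})$,

$$\lim_{\mathbf{h} \to \mathbf{0}} \left(f(\mathbf{c} + \mathbf{h}) - A(\mathbf{c} + \mathbf{h})\right) = \lim_{\mathbf{h} \to \mathbf{0}} \frac{R(\mathbf{h})}{\|\mathbf{h}\|}\|\mathbf{h}\| = 0 \cdot \|\mathbf{0}\| = 0. \tag{3.3.22}$$

Now A is continuous at c, so it follows that

$$\lim_{\mathbf{h} \to \mathbf{0}} f(\mathbf{c} + \mathbf{h}) = \lim_{\mathbf{h} \to \mathbf{0}} A(\mathbf{c} + \mathbf{h}) = A(\mathbf{c}) = f(\mathbf{c}). \tag{3.3.23}$$

In other words, f is continuous at c.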

Theorem If $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at c, then f is continuous at c.

Example Consider the function

$$g(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & \text{if } (x, y) \neq (0, 0), \\ 0, & \text{if } (x, y) = (0, 0). \end{cases}$$

In Section 3.1 we showed that g is not continuous at (0, 0) and in Section 3.2 we saw that
$\nabla g(0, 0) = (0, 0)$. Since g is not continuous at (0, 0), it now follows, from the previous
theorem, that g is not differentiable at (0, 0), even though the gradient exists at that
point. From the graph of g in Figure 3.3.1 (originally seen in Figure 3.1.7), we can see
that the fact that g is not differentiable, in fact, not even continuous, at the origin shows
up geometrically as a tear in the surface.

Figure 3.3.1 The graph of a nondifferentiable function

From this example we see that the differentiability of a function $f : \mathbb{R}^n \to \mathbb{R}$ at a
point c requires more than just the existence of the gradient of f at c. It turns out that
continuity of the partial derivatives of f on an open ball containing c suffices to show that
f is differentiable at c. Note that the partial derivatives of g in the previous example are
not continuous (see Problem 8 of Section 3.2).

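So we will now assume that $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on some open ball containing c. If we
define an affine function $A : \mathbb{R}^n \to \mathbb{R}$ by

$$A(\mathbf{x}) = \nabla f(\mathbf{c}) \cdot (\mathbf{x} - \mathbf{c}) + f(\mathbf{c}), \tag{3.3.24}$$

then the remainder function is

$$R(\mathbf{h}) = f(\mathbf{c} + \mathbf{h}) - A(\mathbf{c} + \mathbf{h}) = f(\mathbf{c} + \mathbf{h}) - f(\mathbf{c}) - \nabla f(\mathbf{c}) \cdot \mathbf{h}. \tag{3.3.25}$$

We need to show that $R(\mathbf{h})$ is $o(\mathbf{h})$. Toward that end, for a fixed $\mathbf{h} \neq \mathbf{0}$, define $\varphi : \mathbb{R} \to \mathbb{R}$
by

$$\varphi(t) = f(\mathbf{c} + t\mathbf{h}). \tag{3.3.26}$$

We first note that $\varphi$ is differentiable with

$$\begin{aligned}
\varphi'(t) &= \lim_{s \to 0} \frac{\varphi(t + s) - \varphi(t)}{s} \\
&= \lim_{s \to 0} \frac{f(\mathbf{c} + (t + s)\mathbf{h}) - f(\mathbf{c} + t\mathbf{h})}{s} \\
&= \|\mathbf{h}\| \lim_{s \to 0} \frac{f\left(\mathbf{c} + t\mathbf{h} + s\|\mathbf{h}\|\frac{\mathbf{h}}{\|\mathbf{h}\|}\right) - f(\mathbf{c} + t\mathbf{h})}{s\|\mathbf{h}\|} \\
&= \|\mathbf{h}\|\, D_{\frac{\mathbf{h}}{\|\mathbf{h}\|}} f(\mathbf{c} + t\mathbf{h}) \\
&= \|\mathbf{h}\| \left(\nabla f(\mathbf{c} + t\mathbf{h}) \cdot \frac{\mathbf{h}}{\|\mathbf{h}\|}\right) \\
&= \nabla f(\mathbf{c} + t\mathbf{h}) \cdot \mathbf{h}.
\end{aligned} \tag{3.3.27}$$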

From the Mean Value Theorem of single-variable calculus, it follows that there exists a
number s between 0 and 1 such that

\[ \varphi'(s) = \varphi(1) - \varphi(0) = f(c + h) - f(c). \tag{3.3.28} \]

Hence we may write

\[ R(h) = \nabla f(c + sh) \cdot h - \nabla f(c) \cdot h = (\nabla f(c + sh) - \nabla f(c)) \cdot h. \tag{3.3.29} \]

Applying the Cauchy-Schwarz inequality to (3.3.29),

\[ |R(h)| \leq \|\nabla f(c + sh) - \nabla f(c)\| \|h\|, \tag{3.3.30} \]

and so

\[ \frac{|R(h)|}{\|h\|} \leq \|\nabla f(c + sh) - \nabla f(c)\|. \tag{3.3.31} \]

Now the partial derivatives of f are continuous, so

\[ \lim_{h \to 0} \|\nabla f(c + sh) - \nabla f(c)\| = \|\nabla f(c + 0) - \nabla f(c)\| = \|\nabla f(c) - \nabla f(c)\| = 0. \tag{3.3.32} \]

Hence

\[ \lim_{h \to 0} \frac{R(h)}{\|h\|} = 0. \tag{3.3.33} \]

That is, $R(h)$ is $o(h)$ and A is the best affine approximation to f at c. Thus we have the
following fundamental theorem.

Theorem If $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on an open ball containing the point c, then f is
differentiable at c.

Example Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

\[ f(x, y) = 4 - 2x^2 - y^2. \]

To find the best affine approximation to f at (1, 1), we first compute

\[ \nabla f(x, y) = (-4x, -2y). \]

Thus $\nabla f(1, 1) = (-4, -2)$ and $f(1, 1) = 1$, so the best affine approximation is

\[ A(x, y) = (-4, -2) \cdot (x - 1, y - 1) + 1. \]

Simplifying, we have

\[ A(x, y) = -4x - 2y + 7. \]
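As a quick numerical illustration of what "best affine approximation" means here, the sketch below evaluates the ratio |R(h)|/||h|| for shrinking h, using the f and A of this example. The function name `remainder_ratio` and the particular direction h = (s, -s) are our own choices for illustration, not part of the text.

```python
import math


def f(x, y):
    return 4 - 2 * x**2 - y**2


def A(x, y):
    # Best affine approximation to f at (1, 1), taken from the example.
    return -4 * x - 2 * y + 7


def remainder_ratio(h1, h2):
    """Return |R(h)| / ||h|| for h = (h1, h2), where R(h) = f(c + h) - A(c + h), c = (1, 1)."""
    r = f(1 + h1, 1 + h2) - A(1 + h1, 1 + h2)
    return abs(r) / math.hypot(h1, h2)


# The ratio should shrink roughly in proportion to the size of h.
for scale in (1e-1, 1e-2, 1e-3):
    print(scale, remainder_ratio(scale, -scale))
```

Here R(h) works out to -2h_1^2 - h_2^2, so the ratio is proportional to ||h|| and tends to 0, as the definition of o(h) requires.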



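Example Suppose $f : \mathbb{R}^3 \to \mathbb{R}$ is defined by

\[ f(x, y, z) = \sqrt{x^2 + y^2 + z^2}. \]

Then

\[ \nabla f(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}} (x, y, z). \]

Thus, for example, the best affine approximation to f at (2, 1, 2) is

\[ \begin{aligned}
A(x, y, z) &= \nabla f(2, 1, 2) \cdot (x - 2, y - 1, z - 2) + f(2, 1, 2) \\
&= \frac{1}{3}(2, 1, 2) \cdot (x - 2, y - 1, z - 2) + 3 \\
&= \frac{2}{3}(x - 2) + \frac{1}{3}(y - 1) + \frac{2}{3}(z - 2) + 3 \\
&= \frac{2}{3}x + \frac{1}{3}y + \frac{2}{3}z.
\end{aligned} \]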

Now suppose we let x, y, and z be the lengths of the three sides of a solid block, in which case
$f(x, y, z)$ represents the length of the diagonal of the block. Moreover, suppose we measure
the sides of the block and find them to have lengths $x = 2 + \epsilon_x$, $y = 1 + \epsilon_y$, and $z = 2 + \epsilon_z$,
where $|\epsilon_x| \leq h$, $|\epsilon_y| \leq h$, and $|\epsilon_z| \leq h$ for some positive number h representing the limit of
the accuracy of our measuring device. We now estimate the diagonal of the block to be

\[ f(2, 1, 2) = 3 \]

with an error of

\[ \begin{aligned}
|f(2 + \epsilon_x, 1 + \epsilon_y, 2 + \epsilon_z) - f(2, 1, 2)| &\approx |A(2 + \epsilon_x, 1 + \epsilon_y, 2 + \epsilon_z) - 3| \\
&= \left| \frac{2}{3}\epsilon_x + \frac{1}{3}\epsilon_y + \frac{2}{3}\epsilon_z \right| \\
&\leq \left( \frac{2}{3} + \frac{1}{3} + \frac{2}{3} \right) h \\
&= \frac{5}{3} h.
\end{aligned} \]

That is, we expect our error in estimating the diagonal of the block to be no more than $\frac{5}{3}$
times the maximum error in our measurements of the sides of the block. For example, if
our length measurements are off by no more than ±0.1 centimeters, then our
estimate of the diagonal of the block is off by no more than about ±0.17 centimeters.
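The linear error estimate can be compared with the true worst-case error. The sketch below is only an illustration under our own assumptions: it checks the eight corners of the cube of possible measurement errors, which is enough here because the diagonal length is increasing in each side length and is a convex function. The names `diagonal` and `worst_case_error` are ours.

```python
import itertools
import math


def diagonal(x, y, z):
    return math.sqrt(x * x + y * y + z * z)


def worst_case_error(h):
    """Largest |f(2+ex, 1+ey, 2+ez) - 3| over the corners ex, ey, ez in {-h, +h}."""
    return max(
        abs(diagonal(2 + ex, 1 + ey, 2 + ez) - 3)
        for ex, ey, ez in itertools.product((-h, h), repeat=3)
    )


h = 0.1
# Compare the true worst-case error with the affine estimate (5/3) h.
print(worst_case_error(h), 5 * h / 3)
```

The two numbers agree closely but not exactly, which is consistent with the "approximately equal" step in the estimate: the affine bound is a first-order approximation, not a strict guarantee.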

Note that if $A : \mathbb{R}^n \to \mathbb{R}$ is the best affine approximation to $f : \mathbb{R}^n \to \mathbb{R}$ at
$c = (c_1, c_2, \ldots, c_n)$, then the graph of A is the set of all points $(x_1, x_2, \ldots, x_n, z)$ in $\mathbb{R}^{n+1}$
satisfying

\[ z = \nabla f(c) \cdot (x_1 - c_1, x_2 - c_2, \ldots, x_n - c_n) + f(c). \tag{3.3.34} \]

Letting

\[ \mathbf{n} = \left( \frac{\partial}{\partial x_1} f(c), \frac{\partial}{\partial x_2} f(c), \ldots, \frac{\partial}{\partial x_n} f(c), -1 \right), \tag{3.3.35} \]

we may describe the graph of A as the set of all points in $\mathbb{R}^{n+1}$ satisfying

\[ \mathbf{n} \cdot (x_1 - c_1, x_2 - c_2, \ldots, x_n - c_n, z - f(c)) = 0. \tag{3.3.36} \]

Thus the graph of A is a hyperplane in $\mathbb{R}^{n+1}$ passing through the point $(c_1, c_2, \ldots, c_n, f(c))$
(a point on the graph of f) with normal vector $\mathbf{n}$.

Definition If $A : \mathbb{R}^n \to \mathbb{R}$ is the best affine approximation to $f : \mathbb{R}^n \to \mathbb{R}$ at
$c = (c_1, c_2, \ldots, c_n)$, then we call the graph of A the tangent hyperplane to the graph of f at
$(c_1, c_2, \ldots, c_n, f(c))$.

Example We saw above that the best affine approximation to

\[ f(x, y) = 4 - 2x^2 - y^2 \]

at (1, 1) is

\[ A(x, y) = 7 - 4x - 2y. \]

Hence the equation of the tangent plane to the graph of f at (1, 1, 1) is

\[ z = 7 - 4x - 2y, \]

or

\[ 4x + 2y + z = 7. \]

Note that the vector $\mathbf{n} = (4, 2, 1)$ is normal to the tangent plane, and hence normal to the
graph of f at (1, 1, 1). The graph of f along with the tangent plane at (1, 1, 1) is shown in
Figure 3.3.2.

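The chain rule

Suppose $\varphi : \mathbb{R} \to \mathbb{R}^n$ is differentiable at a point c and $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at the
point $\varphi(c)$. Then the composition of f and $\varphi$ is a function $f \circ \varphi : \mathbb{R} \to \mathbb{R}$. To compute the
derivative of $f \circ \varphi$ at c, we must evaluate

\[ (f \circ \varphi)'(c) = \lim_{h \to 0} \frac{f \circ \varphi(c + h) - f \circ \varphi(c)}{h} = \lim_{h \to 0} \frac{f(\varphi(c + h)) - f(\varphi(c))}{h}. \tag{3.3.37} \]

Figure 3.3.2 A plane tangent to the graph of $f(x, y) = 4 - 2x^2 - y^2$

Let A be the best affine approximation to f at $a = \varphi(c)$ and let $k = \varphi(c + h) - \varphi(c)$. Then

\[ f(\varphi(c + h)) = f(a + k) = A(a + k) + R(k), \tag{3.3.38} \]

where $R(k)$ is $o(k)$. Now

\[ A(a + k) = \nabla f(a) \cdot k + f(a), \tag{3.3.39} \]

so

\[ \begin{aligned}
f(\varphi(c + h)) - f(\varphi(c)) &= f(a + k) - f(a) \\
&= \nabla f(a) \cdot k + R(k) \\
&= \nabla f(a) \cdot (\varphi(c + h) - \varphi(c)) + R(k).
\end{aligned} \tag{3.3.40} \]

Substituting (3.3.40) into (3.3.37), we have

\[ \begin{aligned}
(f \circ \varphi)'(c) &= \lim_{h \to 0} \frac{\nabla f(a) \cdot (\varphi(c + h) - \varphi(c)) + R(k)}{h} \\
&= \lim_{h \to 0} \nabla f(a) \cdot \frac{\varphi(c + h) - \varphi(c)}{h} + \lim_{h \to 0} \frac{R(k)}{h} \\
&= \nabla f(a) \cdot D\varphi(c) + \lim_{h \to 0} \frac{R(k)}{h}.
\end{aligned} \tag{3.3.41} \]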
Now $R(k)$ is $o(k)$, so

\[ \lim_{k \to 0} \frac{R(k)}{\|k\|} = 0, \]

from which it follows that, for any given $\epsilon > 0$, we have

\[ \frac{|R(k)|}{\|k\|} < \epsilon \tag{3.3.42} \]

for sufficiently small $k \neq 0$. Since $R(0) = 0$, it follows that

\[ |R(k)| \leq \epsilon \|k\| \tag{3.3.43} \]

for all k sufficiently small. Moreover, $\varphi$ is continuous at c, so we may choose h small
enough to guarantee that

\[ k = \varphi(c + h) - \varphi(c) \]

is small enough for (3.3.43) to hold. Hence for sufficiently small $h \neq 0$,

\[ \frac{|R(k)|}{|h|} \leq \frac{\epsilon \|k\|}{|h|}. \tag{3.3.44} \]

Now

\[ \lim_{h \to 0} \frac{\|k\|}{|h|} = \lim_{h \to 0} \left\| \frac{\varphi(c + h) - \varphi(c)}{h} \right\| = \|D\varphi(c)\| \tag{3.3.45} \]

and the choice of $\epsilon$ was arbitrary, so it follows that

\[ \lim_{h \to 0} \frac{R(k)}{h} = 0. \tag{3.3.46} \]

Hence

\[ (f \circ \varphi)'(c) = \nabla f(a) \cdot D\varphi(c). \tag{3.3.47} \]

This is a version of the chain rule.

Theorem Suppose $\varphi : \mathbb{R} \to \mathbb{R}^n$ is differentiable at c and $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at
$\varphi(c)$. Then

\[ (f \circ \varphi)'(c) = \nabla f(\varphi(c)) \cdot D\varphi(c). \tag{3.3.48} \]

If we imagine a particle moving along the curve C parametrized by $\varphi$, with velocity
$v(t)$ and unit tangent vector $T(t)$ at time t, then (3.3.48) says that the rate of change of
f along C at $\varphi(c)$ is

\[ \nabla f(\varphi(c)) \cdot v(c) = \|v(c)\| \nabla f(\varphi(c)) \cdot T(c) = \|v(c)\| D_{T(c)} f(\varphi(c)). \tag{3.3.49} \]

In other words, the rate of change of f along C is the rate of change of f in the direction
of $T(t)$ multiplied by the speed of the particle moving along the curve.
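Example Suppose that the temperature at a point (x, y, z) inside a cubical region of
space is given by

\[ T(x, y, z) = 80 - 20x e^{-\frac{1}{20}(x^2 + y^2 + z^2)}. \]

Moreover, suppose a bug flies through this region along the elliptical helix parametrized
by

\[ \varphi(t) = (\cos(\pi t), 2\sin(\pi t), t). \]

Then

\[ \nabla T(x, y, z) = e^{-\frac{1}{20}(x^2 + y^2 + z^2)} (2x^2 - 20, 2xy, 2xz) \]

and

\[ D\varphi(t) = (-\pi \sin(\pi t), 2\pi \cos(\pi t), 1). \]

Hence, for example, if we want to know the rate of change of temperature for the bug at
$t = \frac{1}{3}$, we would evaluate

\[ D\varphi\left(\tfrac{1}{3}\right) = \left( -\frac{\sqrt{3}}{2}\pi, \pi, 1 \right) \]

and

\[ \nabla T\left(\varphi\left(\tfrac{1}{3}\right)\right) = \nabla T\left(\tfrac{1}{2}, \sqrt{3}, \tfrac{1}{3}\right) = e^{-\frac{121}{720}} \left( -\frac{39}{2}, \sqrt{3}, \frac{1}{3} \right), \]

so

\[ \begin{aligned}
(T \circ \varphi)'\left(\tfrac{1}{3}\right) &= e^{-\frac{121}{720}} \left( -\frac{39}{2}, \sqrt{3}, \frac{1}{3} \right) \cdot \left( -\frac{\sqrt{3}}{2}\pi, \pi, 1 \right) \\
&= e^{-\frac{121}{720}} \left( \frac{39\sqrt{3}\pi}{4} + \sqrt{3}\pi + \frac{1}{3} \right) \\
&\approx 49.73,
\end{aligned} \]

where the final value has been rounded to two decimal places. Hence at that moment the
temperature for the bug is increasing at a rate of 49.73° per second. We could also express
this as

\[ \left. \frac{dT}{dt} \right|_{t = \frac{1}{3}} = 49.73°. \]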
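The chain rule computation in this example can be cross-checked numerically. The sketch below assumes the temperature function and helix as reconstructed above (exponent -(x^2 + y^2 + z^2)/20 and evaluation at t = 1/3); the function names are ours. It compares the chain rule value with a central difference quotient of the composition.

```python
import math


def T(x, y, z):
    return 80 - 20 * x * math.exp(-(x * x + y * y + z * z) / 20)


def grad_T(x, y, z):
    e = math.exp(-(x * x + y * y + z * z) / 20)
    return (e * (2 * x * x - 20), e * 2 * x * y, e * 2 * x * z)


def phi(t):
    return (math.cos(math.pi * t), 2 * math.sin(math.pi * t), t)


def D_phi(t):
    return (-math.pi * math.sin(math.pi * t), 2 * math.pi * math.cos(math.pi * t), 1.0)


def rate_chain_rule(t):
    """Rate of change of T along the helix, computed as grad T(phi(t)) . D phi(t)."""
    return sum(g * v for g, v in zip(grad_T(*phi(t)), D_phi(t)))


def rate_finite_difference(t, dt=1e-6):
    """Central difference approximation to the derivative of T(phi(t))."""
    return (T(*phi(t + dt)) - T(*phi(t - dt))) / (2 * dt)


print(rate_chain_rule(1 / 3), rate_finite_difference(1 / 3))
```

Both numbers should agree with the value of about 49.73 found in the example.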

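For an alternative formulation of the chain rule, suppose $f : \mathbb{R}^n \to \mathbb{R}$ and $x_i : \mathbb{R} \to \mathbb{R}$,
$i = 1, 2, \ldots, n$, are all differentiable and let $w = f(x_1, x_2, \ldots, x_n)$. If $x_1, x_2, \ldots, x_n$ are all
functions of t, then, by the chain rule,

\[ \begin{aligned}
\frac{dw}{dt} &= \left( \frac{\partial w}{\partial x_1}, \frac{\partial w}{\partial x_2}, \ldots, \frac{\partial w}{\partial x_n} \right) \cdot \left( \frac{dx_1}{dt}, \frac{dx_2}{dt}, \ldots, \frac{dx_n}{dt} \right) \\
&= \frac{\partial w}{\partial x_1} \frac{dx_1}{dt} + \frac{\partial w}{\partial x_2} \frac{dx_2}{dt} + \cdots + \frac{\partial w}{\partial x_n} \frac{dx_n}{dt}.
\end{aligned} \tag{3.3.50} \]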
Example Suppose the dimensions of a box are increasing so that its length, width, and
height at time t are, in centimeters,

\[ x = 3t, \quad y = t^2, \quad \text{and} \quad z = t^3, \]

respectively. Since the volume of the box is

\[ V = xyz, \]

the rate of change of the volume is

\[ \frac{dV}{dt} = \frac{\partial V}{\partial x} \frac{dx}{dt} + \frac{\partial V}{\partial y} \frac{dy}{dt} + \frac{\partial V}{\partial z} \frac{dz}{dt} = 3yz + 2xzt + 3xyt^2. \]

Hence, for example, at $t = 2$ we have $x = 6$, $y = 4$, and $z = 8$, so

\[ \left. \frac{dV}{dt} \right|_{t = 2} = 96 + 192 + 288 = 576 \text{ cm}^3/\text{sec}. \]
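The following short sketch evaluates the chain rule formula for the box example and compares it with a difference quotient of the volume. It assumes the dimensions x = 3t, y = t^2, z = t^3 given in the example; the function names are ours.

```python
def volume(t):
    x, y, z = 3 * t, t**2, t**3
    return x * y * z


def dV_dt(t):
    """Chain rule: dV/dt = yz * dx/dt + xz * dy/dt + xy * dz/dt."""
    x, y, z = 3 * t, t**2, t**3
    return y * z * 3 + x * z * (2 * t) + x * y * (3 * t**2)


def dV_dt_numeric(t, dt=1e-5):
    """Central difference approximation to dV/dt, used only as a cross-check."""
    return (volume(t + dt) - volume(t - dt)) / (2 * dt)


print(dV_dt(2), dV_dt_numeric(2))
```

Since V = 3t^6 along this path, the result can also be confirmed directly as 18t^5 evaluated at t = 2.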



The gradient and level sets

Now consider a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ and a point a on the level set S specified
by $f(x) = c$ for some scalar c. Suppose $\varphi : \mathbb{R} \to \mathbb{R}^n$ is a smooth parametrization of a curve
C which lies entirely on S and passes through a. Let $\varphi(b) = a$. Then the composition of
f and $\varphi$ is a constant function; that is,

\[ g(t) = f \circ \varphi(t) = f(\varphi(t)) = c \tag{3.3.51} \]

for all values of t. Thus, using the chain rule,

\[ 0 = g'(b) = \nabla f(\varphi(b)) \cdot D\varphi(b) = \nabla f(a) \cdot D\varphi(b). \tag{3.3.52} \]

Hence

\[ \nabla f(a) \perp D\varphi(b). \tag{3.3.53} \]

Now $D\varphi(b)$ is tangent to C at a; moreover, since (3.3.53) holds for any curve in S passing
through a, $\nabla f(a)$ is orthogonal to every vector tangent to S at a. In other words, $\nabla f(a)$ is
normal to the hyperplane tangent to S at a. Thus we have the following theorem.

Theorem Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable on an open ball containing the point a,
and let S be the set of all points in $\mathbb{R}^n$ such that $f(x) = f(a)$. If $\nabla f(a) \neq 0$, then the
hyperplane with equation

\[ \nabla f(a) \cdot (x - a) = 0 \tag{3.3.54} \]

is tangent to S at a.
Figure 3.3.3 Sphere with tangent plane

For $n = 2$, the hyperplane described by (3.3.54) will be a tangent line to a curve; for
$n = 3$, it will be a tangent plane to a surface.

Example The set of all points S in $\mathbb{R}^3$ satisfying

\[ x^2 + y^2 + z^2 = 9 \]

is a sphere with radius 3 centered at the origin. We will find an equation for the plane
tangent to S at (2, −1, 2). First note that S is a level surface for the function

\[ f(x, y, z) = x^2 + y^2 + z^2. \]

Now

\[ \nabla f(x, y, z) = (2x, 2y, 2z), \]

so

\[ \nabla f(2, -1, 2) = (4, -2, 4). \]

Thus an equation for the tangent plane is

\[ (4, -2, 4) \cdot (x - 2, y + 1, z - 2) = 0, \]

or

\[ 4x - 2y + 4z = 18. \]

See Figure 3.3.3.
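The tangent plane computation for the sphere follows a fixed recipe: take the gradient at the point as the normal vector and evaluate the right-hand side at the point itself. A minimal sketch, with the hypothetical helper name `tangent_plane`, for the function f(x, y, z) = x^2 + y^2 + z^2 of this example:

```python
def grad_f(x, y, z):
    # Gradient of f(x, y, z) = x^2 + y^2 + z^2.
    return (2 * x, 2 * y, 2 * z)


def tangent_plane(a):
    """Return (normal, d) so that the tangent plane at the point a is normal . x = d."""
    normal = grad_f(*a)
    d = sum(n_i * a_i for n_i, a_i in zip(normal, a))
    return normal, d


normal, d = tangent_plane((2, -1, 2))
print(normal, d)
```

The pair returned corresponds to the coefficients and right-hand side of the plane equation found in the example.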
Problems

1. For each of the following, find the best affine approximation to the given function at
the specified point c.

(a) $f(x, y) = 3x^2 + 4y^2 - 2$, $c = (2, 1)$

(b) $g(x, y) = y^2 - x^2$, $c = (1, -2)$

(c) $g(x, y) = y^2 - x^2$, $c = (0, 0)$

(d) $f(x, y, z) = -\log(x^2 + y^2 + z^2)$, $c = (1, 0, 0)$

(e) h(w, x, y, z) = w 2 + x 2 + 3y 2 = 2z 2 ,c = (1, 2, -2, 1) 

2. For each of the following, find the equation of the plane tangent to the graph of f for
the given point c. Plot the graph and the tangent plane together.

(a) $f(x, y) = 4x^2 + y^2$, $c = (1, -1)$ (b) $f(x, y) = \sqrt{9 - x^2 - y^2}$, $c = (2, 1)$

(c) $f(x, y) = 9 - x^2 - y^2$, $c = (2, -2)$ (d) $f(x, y) = 3y^2 - x^2$, $c = (1, -1)$

3. Suppose $A : \mathbb{R}^n \to \mathbb{R}$ is the best affine approximation to $f : \mathbb{R}^n \to \mathbb{R}$ at c. Explain why
$|\nabla f(c) \cdot h|$ is a good approximation for $|f(c + h) - f(c)|$ when $\|h\|$ is small. That is,
explain why $|\nabla f(c) \cdot h|$ is a good approximation for the error in approximating $f(c + h)$
by $f(c)$.

4. Suppose $f : \mathbb{R}^3 \to \mathbb{R}$ is defined by $f(x, y, z) = xyz$.

(a) Find the best affine approximation to / at (3, 2, 4). 

(b) Suppose x, y, and z represent the length, width, and height of a box. Suppose you 
measure the length to be 3 ± h centimeters, the width to be 2 ± h centimeters, and 
the height to be 4 ± h centimeters. Use the best affine approximation from (a) to 
approximate the maximum error you would make in computing the volume of the 
box from these measurements. 

5. A metal plate is heated so that its temperature at a point (x, y) is 

T(x,y) = 50y 2 e-^ x2+y2 \ 
A bug moves along the ellipse parametrized by 

a(t) = (cos(t),2sin(t)). 

Find the rate of change of temperature for the bug at times t = 0, t = \ , and t = | . 

6. Let x, y, and z be the length, width, and height, respectively, of a box. Suppose the 
box is increasing in size so that when x = 3 centimeters, y = 2 centimeters, and z = 5 
centimeters, the length is increasing at a rate of 2 centimeters per second, the width at a
rate of 4 centimeters per second, and the height at a rate of 3 centimeters per second. 

(a) Find the rate of change of the volume of the box at this time. 

(b) Find the rate of change of the length of the diagonal of the box at this time. 
7. Suppose $w = -\log(x^2 + y^2 + z^2)$ and $(x, y, z) = (4t, \sin(t), \cos(t))$. Find

(it t=f* 

8. The kinetic energy K of an object of mass m moving in a straight line with velocity v
is

\[ K = \frac{1}{2}mv^2. \]

If, at time $t = t_0$, $m = 2000$ kilograms, $v = 50$ meters per second, m is decreasing at a
rate of 2 kilograms per second, and v is increasing at a rate of 1.5 meters per second
per second, find

\[ \left. \frac{dK}{dt} \right|_{t = t_0}. \]



9. Each of the following equations specifies some curve in $\mathbb{R}^2$. In each case, find an
equation for the line tangent to the curve at the given point a.

(a) $x^2 + y^2 = 5$, $a = (2, 1)$ (b) $2x^2 + 4y^2 = 18$, $a = (1, -2)$

(c) $y^2 - x = 0$, $a = (4, -2)$ (d) $y^2 - x^2 = 5$, $a = (-2, 3)$

10. Each of the following equations specifies some surface in $\mathbb{R}^3$. In each case, find an
equation for the plane tangent to the surface at the given point a.

(a) $x^2 + y^2 + z^2 = 14$, $a = (2, 1, -3)$ (b) $x^2 + 3y^2 + 2z^2 = 9$, $a = (2, -1, 1)$

(c) $x^2 + y^2 - z^2 = 1$, $a = (1, 2, 2)$ (d) $xyz = 6$, $a = (1, 2, 3)$

11. Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is differentiable at (a, b), $f(a, b) = c$, and $\frac{\partial}{\partial y} f(a, b) \neq 0$. Let C be
the level curve of f with equation $f(x, y) = c$. Show that

\[ y = -\frac{\frac{\partial}{\partial x} f(a, b)}{\frac{\partial}{\partial y} f(a, b)} (x - a) + b \]

is an equation for the line tangent to C at (a, b).
In one-variable calculus, Taylor polynomials provide a natural way to extend best affine
approximations to higher-order polynomial approximations. It is possible to generalize 
these ideas to scalar-valued functions of two or more variables, but the theory rapidly 
becomes involved and technical. In this section we will be content merely to point the 
way with a discussion of second-degree Taylor polynomials. Even at this level, it is best 
to leave full explanations for a course in advanced calculus. 

Higher-order derivatives 

The first step is to introduce higher-order derivatives. If $f : \mathbb{R}^n \to \mathbb{R}$ has partial derivatives
which exist on an open set U, then, for any $i = 1, 2, 3, \ldots, n$, $\frac{\partial f}{\partial x_i}$ is itself a function from $\mathbb{R}^n$
to $\mathbb{R}$. The partial derivatives of $\frac{\partial f}{\partial x_i}$, if they exist, are called second-order partial derivatives
of f. We may denote the partial derivative of $\frac{\partial f}{\partial x_i}$ with respect to $x_j$, $j = 1, 2, 3, \ldots, n$,
evaluated at a point x, by either $\frac{\partial^2}{\partial x_j \partial x_i} f(x)$, or $f_{x_i x_j}(x)$, or $D_{x_i x_j} f(x)$. Note the order
in which the variables are written; it is possible that differentiating first with respect to
$x_i$ and second with respect to $x_j$ will yield a different result than if the order were reversed.
If $j = i$, we will write $\frac{\partial^2}{\partial x_i^2} f(x)$ for $\frac{\partial^2}{\partial x_i \partial x_i} f(x)$. It is, of course, possible to extend this
notation to third, fourth, and higher-order derivatives.

Example Suppose $f(x, y) = x^2y - 3x\sin(2y)$. Then

\[ f_x(x, y) = 2xy - 3\sin(2y) \]

and

\[ f_y(x, y) = x^2 - 6x\cos(2y), \]

so

\[ \begin{aligned}
f_{xx}(x, y) &= 2y, \\
f_{xy}(x, y) &= 2x - 6\cos(2y), \\
f_{yy}(x, y) &= 12x\sin(2y),
\end{aligned} \]

and

\[ f_{yx}(x, y) = 2x - 6\cos(2y). \]

Note that, in this example, $f_{xy}(x, y) = f_{yx}(x, y)$. For an example of a third-order derivative,

\[ f_{yxy}(x, y) = 12\sin(2y). \]
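Example Suppose $w = xy^2z^3 - 4xy\log(z)$. Then, for example,

\[ \frac{\partial^2 w}{\partial y \partial x} = \frac{\partial}{\partial y} \left( \frac{\partial w}{\partial x} \right) = \frac{\partial}{\partial y} \left( y^2z^3 - 4y\log(z) \right) = 2yz^3 - 4\log(z) \]

and

\[ \frac{\partial^2 w}{\partial z^2} = \frac{\partial}{\partial z} \left( \frac{\partial w}{\partial z} \right) = \frac{\partial}{\partial z} \left( 3xy^2z^2 - \frac{4xy}{z} \right) = 6xy^2z + \frac{4xy}{z^2}. \]

Also,

\[ \frac{\partial^2 w}{\partial x \partial y} = \frac{\partial}{\partial x} \left( \frac{\partial w}{\partial y} \right) = \frac{\partial}{\partial x} \left( 2xyz^3 - 4x\log(z) \right) = 2yz^3 - 4\log(z), \]

and so

\[ \frac{\partial^2 w}{\partial y \partial x} = \frac{\partial^2 w}{\partial x \partial y}. \]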

In both of our examples we have seen instances where mixed second partial derivatives, 
that is, second-order partial derivatives with respect to two different variables, taken in 
different orders are equal. This is not always the case, but does follow if we assume that 
both of the mixed partial derivatives in question are continuous. 

Definition We say a function $f : \mathbb{R}^n \to \mathbb{R}$ is $C^2$ on an open set U if $f_{x_j x_i}$ is continuous
on U for each $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$.

Theorem If f is $C^2$ on an open ball containing a point c, then

\[ \frac{\partial^2}{\partial x_j \partial x_i} f(c) = \frac{\partial^2}{\partial x_i \partial x_j} f(c) \]

for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$.

Although we have the tools to verify this result, we will leave the justification for a
more advanced course.

We shall see that it is convenient to use a matrix to arrange the second partial derivatives
of a function f. If $f : \mathbb{R}^n \to \mathbb{R}$, there are $n^2$ second partial derivatives and this matrix
will be $n \times n$.



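Definition Suppose the second-order partial derivatives of $f : \mathbb{R}^n \to \mathbb{R}$ all exist at the
point c. We call the $n \times n$ matrix

\[ Hf(c) = \begin{bmatrix}
\frac{\partial^2}{\partial x_1^2} f(c) & \frac{\partial^2}{\partial x_2 \partial x_1} f(c) & \frac{\partial^2}{\partial x_3 \partial x_1} f(c) & \cdots & \frac{\partial^2}{\partial x_n \partial x_1} f(c) \\
\frac{\partial^2}{\partial x_1 \partial x_2} f(c) & \frac{\partial^2}{\partial x_2^2} f(c) & \frac{\partial^2}{\partial x_3 \partial x_2} f(c) & \cdots & \frac{\partial^2}{\partial x_n \partial x_2} f(c) \\
\frac{\partial^2}{\partial x_1 \partial x_3} f(c) & \frac{\partial^2}{\partial x_2 \partial x_3} f(c) & \frac{\partial^2}{\partial x_3^2} f(c) & \cdots & \frac{\partial^2}{\partial x_n \partial x_3} f(c) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2}{\partial x_1 \partial x_n} f(c) & \frac{\partial^2}{\partial x_2 \partial x_n} f(c) & \frac{\partial^2}{\partial x_3 \partial x_n} f(c) & \cdots & \frac{\partial^2}{\partial x_n^2} f(c)
\end{bmatrix} \tag{3.4.1} \]

the Hessian of f at c.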
Put another way, the Hessian of f at c is the $n \times n$ matrix whose ith row is $\nabla f_{x_i}(c)$.

Example Suppose $f(x, y) = x^2y - 3x\sin(2y)$. Then, using our results from above,

\[ Hf(x, y) = \begin{bmatrix} f_{xx}(x, y) & f_{xy}(x, y) \\ f_{yx}(x, y) & f_{yy}(x, y) \end{bmatrix} = \begin{bmatrix} 2y & 2x - 6\cos(2y) \\ 2x - 6\cos(2y) & 12x\sin(2y) \end{bmatrix}. \]

Thus, for example,

\[ Hf(2, 0) = \begin{bmatrix} 0 & -2 \\ -2 & 0 \end{bmatrix}. \]
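The entries of a Hessian can be checked numerically with difference quotients. The sketch below is an illustration only, using standard central difference formulas and a step size h = 1e-4 that we chose for convenience; the helper name `hessian` is ours. It is applied to f(x, y) = x^2 y - 3x sin(2y) at the point (2, 0).

```python
import math


def f(x, y):
    return x**2 * y - 3 * x * math.sin(2 * y)


def hessian(func, x, y, h=1e-4):
    """Approximate the 2 x 2 Hessian of func at (x, y) with central differences."""
    fxx = (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4 * h**2)
    # The same symmetric difference quotient estimates both mixed partials,
    # so this sketch does not test their equality independently.
    return [[fxx, fxy], [fxy, fyy]]


H = hessian(f, 2.0, 0.0)
print(H)
```

The printed entries should be close to the exact values computed in the example, with zeros on the diagonal and -2 off the diagonal.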



Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is $C^2$ on an open ball $B^2(c, r)$ and let $h = (h_1, h_2)$ be a point
with $\|h\| < r$. If we define $\varphi : \mathbb{R} \to \mathbb{R}$ by $\varphi(t) = f(c + th)$, then $\varphi(0) = f(c)$ and
$\varphi(1) = f(c + h)$. From the one-variable calculus version of Taylor's theorem, we know that

\[ \varphi(1) = \varphi(0) + \varphi'(0) + \frac{1}{2}\varphi''(s), \tag{3.4.2} \]

where s is a real number between 0 and 1. Using the chain rule, we have

\[ \varphi'(t) = \nabla f(c + th) \cdot \frac{d}{dt}(c + th) = \nabla f(c + th) \cdot h = f_x(c + th)h_1 + f_y(c + th)h_2 \tag{3.4.3} \]

and

\[ \begin{aligned}
\varphi''(t) &= h_1 \nabla f_x(c + th) \cdot h + h_2 \nabla f_y(c + th) \cdot h \\
&= (h_1 \nabla f_x(c + th) + h_2 \nabla f_y(c + th)) \cdot h \\
&= \begin{bmatrix} h_1 & h_2 \end{bmatrix} \begin{bmatrix} f_{xx}(c + th) & f_{xy}(c + th) \\ f_{yx}(c + th) & f_{yy}(c + th) \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \\
&= h^T Hf(c + th) h,
\end{aligned} \tag{3.4.4} \]

where we have used the notation

\[ h = \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} \]

and

\[ h^T = \begin{bmatrix} h_1 & h_2 \end{bmatrix}, \]

the latter being called the transpose of h (see Problem 12 of Section 1.6). Hence

\[ \varphi'(0) = \nabla f(c) \cdot h \tag{3.4.5} \]

and

\[ \varphi''(s) = h^T Hf(c + sh) h, \tag{3.4.6} \]

so, substituting into (3.4.2), we have

\[ f(c + h) = \varphi(1) = f(c) + \nabla f(c) \cdot h + \frac{1}{2} h^T Hf(c + sh) h. \tag{3.4.7} \]

This result, a version of Taylor's theorem, is easily generalized to higher dimensions.
Theorem Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is $C^2$ on an open ball $B^n(c, r)$ and let h be a point with
$\|h\| < r$. Then there exists a real number s between 0 and 1 such that

\[ f(c + h) = f(c) + \nabla f(c) \cdot h + \frac{1}{2} h^T Hf(c + sh) h. \tag{3.4.8} \]

If we let $x = c + h$ and evaluate the Hessian at c, (3.4.8) becomes a polynomial
approximation for f.

Definition If $f : \mathbb{R}^n \to \mathbb{R}$ is $C^2$ on an open ball about the point c, then we call

\[ P_2(x) = f(c) + \nabla f(c) \cdot (x - c) + \frac{1}{2}(x - c)^T Hf(c)(x - c) \tag{3.4.9} \]

the second-order Taylor polynomial for f at c.



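Example To find the second-order Taylor polynomial for $f(x, y) = e^{-2x+y}$ at (0, 0), we
compute

\[ \nabla f(x, y) = (-2e^{-2x+y}, e^{-2x+y}) \]

and

\[ Hf(x, y) = \begin{bmatrix} 4e^{-2x+y} & -2e^{-2x+y} \\ -2e^{-2x+y} & e^{-2x+y} \end{bmatrix}, \]

from which it follows that

\[ \nabla f(0, 0) = (-2, 1) \]

and

\[ Hf(0, 0) = \begin{bmatrix} 4 & -2 \\ -2 & 1 \end{bmatrix}. \]

Then

\[ \begin{aligned}
P_2(x, y) &= f(0, 0) + \nabla f(0, 0) \cdot (x, y) + \frac{1}{2} \begin{bmatrix} x & y \end{bmatrix} Hf(0, 0) \begin{bmatrix} x \\ y \end{bmatrix} \\
&= 1 + (-2, 1) \cdot (x, y) + \frac{1}{2} \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 4 & -2 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \\
&= 1 - 2x + y + \frac{1}{2} \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 4x - 2y \\ -2x + y \end{bmatrix} \\
&= 1 - 2x + y + \frac{1}{2}(4x^2 - 2xy - 2xy + y^2) \\
&= 1 - 2x + y + 2x^2 - 2xy + \frac{1}{2}y^2.
\end{aligned} \]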


Symmetric matrices 

Note that if $f : \mathbb{R}^2 \to \mathbb{R}$ is $C^2$ on an open ball about the point c, then the entry in the
ith row and jth column of $Hf(c)$ is equal to the entry in the jth row and ith column of
$Hf(c)$ since

\[ \frac{\partial^2}{\partial x_j \partial x_i} f(c) = \frac{\partial^2}{\partial x_i \partial x_j} f(c). \]

Definition We call a matrix $M = [a_{ij}]$ with the property that $a_{ij} = a_{ji}$ for all $i \neq j$ a
symmetric matrix.



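Example The matrices

\[ \begin{bmatrix} 2 & 1 \\ 1 & 5 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & -7 \end{bmatrix} \]

are both symmetric, while the matrices

\[ \begin{bmatrix} 2 & -1 \\ 3 & 4 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 2 & 1 & 3 \\ 2 & 3 & 4 \\ -2 & 4 & -6 \end{bmatrix} \]

are not symmetric.

Example The Hessian of any $C^2$ scalar-valued function is a symmetric matrix. For
example, the Hessian of $f(x, y) = e^{-2x+y}$, namely,

\[ Hf(x, y) = \begin{bmatrix} 4e^{-2x+y} & -2e^{-2x+y} \\ -2e^{-2x+y} & e^{-2x+y} \end{bmatrix}, \]

is symmetric for any value of (x, y).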

Given an $n \times n$ symmetric matrix M, the function $q : \mathbb{R}^n \to \mathbb{R}$ defined by

\[ q(x) = x^T M x \]

is a quadratic polynomial. When M is the Hessian of some function f, this is the form of
the quadratic term in the second-order Taylor polynomial for f. In the next section it will
be important to be able to determine when this term is positive for all $x \neq 0$ or negative
for all $x \neq 0$.

Definition Let M be an $n \times n$ symmetric matrix and define $q : \mathbb{R}^n \to \mathbb{R}$ by

\[ q(x) = x^T M x. \]

We say M is positive definite if $q(x) > 0$ for all $x \neq 0$ in $\mathbb{R}^n$, negative definite if $q(x) < 0$
for all $x \neq 0$ in $\mathbb{R}^n$, and indefinite if there exists an $x \neq 0$ for which $q(x) > 0$ and an $x \neq 0$
for which $q(x) < 0$. Otherwise, we say M is nondefinite.






In general it is not easy to determine to which of these categories a given symmetric
matrix belongs. However, the important special case of $2 \times 2$ matrices is straightforward.
Consider

\[ M = \begin{bmatrix} a & b \\ b & c \end{bmatrix} \]

and let

\[ q(x, y) = \begin{bmatrix} x & y \end{bmatrix} M \begin{bmatrix} x \\ y \end{bmatrix} = ax^2 + 2bxy + cy^2. \tag{3.4.10} \]

If $a \neq 0$, then we may complete the square in (3.4.10) to obtain

\[ \begin{aligned}
q(x, y) &= a\left( x^2 + \frac{2b}{a}xy \right) + cy^2 \\
&= a\left( x + \frac{b}{a}y \right)^2 - \frac{b^2}{a}y^2 + cy^2 \\
&= a\left( x + \frac{b}{a}y \right)^2 + \frac{\det(M)}{a} y^2.
\end{aligned} \tag{3.4.11} \]

Now suppose $\det(M) > 0$. Then from (3.4.11) we see that $q(x, y) > 0$ for all $(x, y) \neq (0, 0)$
if $a > 0$ and $q(x, y) < 0$ for all $(x, y) \neq (0, 0)$ if $a < 0$. That is, M is positive definite if
$a > 0$ and negative definite if $a < 0$. If $\det(M) < 0$, then $q(1, 0)$ and $q\left(-\frac{b}{a}, 1\right)$ will have
opposite signs, and so M is indefinite. Finally, suppose $\det(M) = 0$. Then

\[ q(x, y) = a\left( x + \frac{b}{a}y \right)^2, \]

so $q(x, y) = 0$ when $x = -\frac{b}{a}y$. Moreover, $q(x, y)$ has the same sign as a for all other values
of (x, y). Hence in this case M is nondefinite.

Similar analyses for the case $a = 0$ give us the following result.



Theorem Suppose

\[ M = \begin{bmatrix} a & b \\ b & c \end{bmatrix}. \]

If $\det(M) > 0$, then M is positive definite if $a > 0$ and negative definite if $a < 0$. If
$\det(M) < 0$, then M is indefinite. If $\det(M) = 0$, then M is nondefinite.



Example The matrix

\[ M = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix} \]

is positive definite since $\det(M) = 5 > 0$ and $2 > 0$.

Example The matrix

\[ M = \begin{bmatrix} -2 & 1 \\ 1 & -4 \end{bmatrix} \]

is negative definite since $\det(M) = 7 > 0$ and $-2 < 0$.

Example The matrix

\[ M = \begin{bmatrix} -3 & 1 \\ 1 & 2 \end{bmatrix} \]

is indefinite since $\det(M) = -7 < 0$.

Example The matrix

\[ M = \begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix} \]

is nondefinite since $\det(M) = 0$.
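The determinant test for 2 x 2 symmetric matrices translates directly into a small decision procedure. The following is a minimal sketch; the function name `classify` and the string labels are our own, and the matrix is passed as its three distinct entries a, b, c.

```python
def classify(a, b, c):
    """Classify the symmetric matrix [[a, b], [b, c]] using the determinant test."""
    det = a * c - b * b
    if det > 0:
        # det > 0 forces a != 0, so the sign of a decides the case.
        return "positive definite" if a > 0 else "negative definite"
    if det < 0:
        return "indefinite"
    return "nondefinite"


for entries in ((2, 1, 3), (-3, 1, 2), (4, 2, 1)):
    print(entries, classify(*entries))
```

With floating-point entries the comparison of the determinant with zero would need a tolerance; the exact comparison here is only appropriate for integer or exactly representable entries.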



In the next section we will see how these ideas help us identify local extreme values 
for scalar valued functions of two variables. 

Problems 

1. Let $f(x, y) = x^3y^2 - 4x^2e^{-3y}$. Find the following.



(c) J^/(*,y) 

(6) dxW 2f ^ y) 
(g) fyy(x,y) 



d 2 

( b ) »V/(^y) 



(d) 



dydx' 
d 3 
dxdydx 



f{x,y) 



(f) ^/(x,y) 
(h) f yxy {x,y) 



xy 



2. Let f(x,y,z) = — = ^ Find the following. 

x 2 + y 2 + z 2 



(a) 



d 2 



dzdx 



f(x,y,z) 



d 



( c ) -^f( x ^y^ z ) 



(b) 



(d) 



5 2 



dydz 



f(x,y,z) 



Q3 



J{x,y,z) 



Q Z 2 J v / v / Q x QyQ z ■ 

(e) f Z yx(x,y,z) (f) f yyy (x,y,z) 

3. Find the Hessian of each of the following functions. 

(a) f(x,y)=3x 2 y — 4xy 3 (b) ^(x, y) = 4e _x cos(3y) 

(c) y, 2) = 4:ryV (d) /(x, y,z) = - log(x 2 + y 2 + z 2 ) 

4. Find the second-order Taylor polynomial for each of the following at the point c.

(a) $f(x, y) = xe^{-y}$, $c = (0, 0)$ (b) $g(x, y) = x\sin(x + y)$, $c = (0, 0)$

(c) $f(x, y) = \dfrac{1}{x + y}$, $c = (1, 1)$ (d) $g(x, y, z) = e^{x - 2y + 3z}$, $c = (0, 0, 0)$

5. Classify each of the following symmetric 2x2 matrices as either positive definite, 
negative definite, indefinite, or nondefinite. 



(a) 
(c) 
(e) 



3 2 
2 4 

-2 
3 

1 
1 



(b) 
(d) 
(f) 



1 2 

2 2 

1 

1 

8 4' 

4 2 



6. Let M be an $n \times n$ symmetric nondefinite matrix and define $q : \mathbb{R}^n \to \mathbb{R}$ by

\[ q(x) = x^T M x. \]

Explain why (1) there exists a vector $a \neq 0$ such that $q(a) = 0$ and (2) either $q(x) \geq 0$
for all x in $\mathbb{R}^n$ or $q(x) \leq 0$ for all x in $\mathbb{R}^n$.

7. Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is $C^2$ on an open ball $B^n(c, r)$, $\nabla f(c) = 0$, and $Hf(x)$ is positive
definite for all x in $B^n(c, r)$. Show that $f(c) < f(x)$ for all $x \neq c$ in $B^n(c, r)$. What would
happen if $Hf(x)$ were negative definite for all x in $B^n(c, r)$? What does this say in
the case $n = 1$?



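8. Let

\[ f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2}, & \text{if } (x, y) \neq (0, 0), \\ 0, & \text{if } (x, y) = (0, 0). \end{cases} \]

(a) Show that $f_x(0, y) = -y$ for all y.

(b) Show that $f_y(x, 0) = x$ for all x.

(c) Show that $f_{yx}(0, 0) \neq f_{xy}(0, 0)$.

(d) Is f $C^2$?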
After a few preliminary results and definitions, we will apply our work from the previous
sections to the problem of finding maximum and minimum values of scalar-valued functions
of several variables. The story here parallels to a great extent the story from one-variable
calculus, with the inevitable twists and turns due to the presence of additional variables.
We will begin with a definition very similar to the analogous definition for functions of a
single variable.

The Extreme Value Theorem 

Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on a set S. We say f has a maximum value
of M at c if $f(c) = M$ and $M \geq f(x)$ for all x in S. We say f has a minimum value of m
at c if $f(c) = m$ and $m \leq f(x)$ for all x in S.

The maximum and minimum values of the previous definition are sometimes referred
to as global maximum and minimum values in order to distinguish them from the local
maximum and minimum values of the next definition.

Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on an open set U. We say f has a local
maximum value of M at c if $f(c) = M$ and $M \geq f(x)$ for all x in $B^n(c, r)$ for some $r > 0$.
We say f has a local minimum value of m at c if $f(c) = m$ and $m \leq f(x)$ for all x in
$B^n(c, r)$ for some $r > 0$.

We will say extreme value, or global extreme value, when referring to a value of / 
which is either a global maximum or a global minimum value, and local extreme value 
when referring to a value which is either a local maximum or a local minimum value. 

In one-variable calculus, the Extreme Value Theorem, the statement that every continuous
function on a finite closed interval has a maximum and a minimum value, was
extremely useful in searching for extreme values. There is a similar result for our current
situation, but first we need the following definition.

Definition We say a set S in $\mathbb{R}^n$ is bounded if there exists an $r > 0$ such that S is
contained in the open ball $B^n(0, r)$.

Equivalently, a set S is bounded as long as there is a fixed distance r such that no
point in S is farther away from the origin than r.

Example Any open or closed ball in $\mathbb{R}^n$ is a bounded set.

Example The infinite rectangle

\[ \{(x, y) : 1 < x < 3, -\infty < y < \infty\} \]

is not bounded.
Extreme Value Theorem Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is continuous on an open set U. If S
is a closed and bounded subset of U, then f has a maximum value and a minimum value
on S.

We leave the justification of this theorem for a more advanced course. 

Our work now is to find criteria for locating candidates for points where local extreme
values might occur, and then to classify these points once we have found them. To begin,
suppose we know $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable on an open set U and that it has a local
extreme value at c. Then for any unit vector u, the function $g : \mathbb{R} \to \mathbb{R}$ defined by
$g(t) = f(c + tu)$ must have a local extreme value at $t = 0$. Hence, from a result in one-variable
calculus, we must have

\[ 0 = g'(0) = D_u f(c) = \nabla f(c) \cdot u. \]

Since u was an arbitrary unit vector in $\mathbb{R}^n$, we have, in particular,

\[ 0 = \nabla f(c) \cdot e_k = \frac{\partial}{\partial x_k} f(c) \]

for $k = 1, 2, \ldots, n$. That is, we must have $\nabla f(c) = 0$. Note that, by itself, $\nabla f(c) = 0$ only
says that the slope of the graph of f is 0 in the direction of the standard basis vectors, but
this in fact implies that the slope is 0 in all directions because $D_u f(c) = \nabla f(c) \cdot u$ for any
unit vector u.

Theorem If $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable on an open set U and has a local extreme value
at c, then $\nabla f(c) = 0$.

Definition If $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at c and $\nabla f(c) = 0$, then we call c a critical
point of f. We call a point c at which f is not differentiable a singular point of f.

Recall that to find the extreme values of a continuous function $f : \mathbb{R} \to \mathbb{R}$ on a closed
interval, we need only to evaluate f at all critical and singular points inside the interval
as well as at the endpoints of the interval, and then inspect these values to identify the
largest and smallest. The story is similar in the situation of a function $f : \mathbb{R}^n \to \mathbb{R}$ which
is defined on a closed and bounded set S and is continuous on some open set containing
S, except instead of having endpoints to consider, we have the entire boundary of S to
consider.

Definition Let S be a set in $\mathbb{R}^n$. We call a point a in $\mathbb{R}^n$ a boundary point of S if for
every $r > 0$, the open ball $B^n(a, r)$ contains both points in S and points outside of S. We
call the set of all boundary points of S the boundary of S.

Example The boundary of the closed set

\[ \bar{B}^2((0, 0), 3) = \{(x, y) : x^2 + y^2 \leq 9\} \]

is the circle

\[ S^1((0, 0), 3) = \{(x, y) : x^2 + y^2 = 9\}. \]

Example In general, the boundary of the closed ball $\bar{B}^n(a, r)$ is the sphere $S^{n-1}(a, r)$.

Example The boundary of the closed rectangle

\[ R = \{(x, y) : 1 \leq x \leq 3, 2 \leq y \leq 5\} \]

consists of the line segments from (1, 2) to (3, 2), (3, 2) to (3, 5), (3, 5) to (1, 5), and (1, 5)
to (1, 2).

Example Suppose we wish to find the global extreme values for the function
$f(x, y) = x^2 + y^2$ on the closed set

\[ D = \{(x, y) : x^2 + 4y^2 \leq 4\}. \]

We first find all the critical and singular points. Now

\[ \nabla f(x, y) = (2x, 2y), \]

so

\[ \nabla f(x, y) = (0, 0) \]

if and only if

\[ 2x = 0, \quad 2y = 0. \]

Hence the only critical point is (0, 0). There are no singular points, but we must consider
the boundary of D, the ellipse

\[ B = \{(x, y) : x^2 + 4y^2 = 4\}. \]

Now we may use

\[ \varphi(t) = (2\cos(t), \sin(t)), \]

$0 \leq t \leq 2\pi$, to parametrize B. It follows that any extreme value of f occurring on B will
also be an extreme value of

\[ \begin{aligned}
g(t) &= f(\varphi(t)) \\
&= f(2\cos(t), \sin(t)) \\
&= 4\cos^2(t) + \sin^2(t) \\
&= 4\cos^2(t) + (1 - \cos^2(t)) \\
&= 3\cos^2(t) + 1
\end{aligned} \]

on the closed interval $[0, 2\pi]$. Now

\[ g'(t) = -6\cos(t)\sin(t), \]

so the critical points of g occur at points t in $(0, 2\pi)$ where either $\cos(t) = 0$ or $\sin(t) = 0$.
Hence the critical points of g are $t = \frac{\pi}{2}$, $t = \pi$, and $t = \frac{3\pi}{2}$. Moreover, we need to consider
the endpoints $t = 0$ and $t = 2\pi$. Hence we have four more candidates for the location
of extreme values, namely, $\varphi(0) = \varphi(2\pi) = (2, 0)$, $\varphi\left(\frac{\pi}{2}\right) = (0, 1)$, $\varphi(\pi) = (-2, 0)$, and
$\varphi\left(\frac{3\pi}{2}\right) = (0, -1)$. Evaluating f at these five points, we have

\[ \begin{aligned}
f(0, 0) &= 0, \\
f(2, 0) &= 4, \\
f(0, 1) &= 1, \\
f(-2, 0) &= 4,
\end{aligned} \]

and

\[ f(0, -1) = 1. \]

Comparing these values, we see that f has a maximum value of 4 at (2, 0) and (−2, 0) and
a minimum value of 0 at (0, 0). See Figure 3.5.1 for the graph of f on the set D.

Figure 3.5.1 Graph of $f(x, y) = x^2 + y^2$ on $D = \{(x, y) : x^2 + 4y^2 \leq 4\}$
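The candidate-point procedure of this example can be mirrored in a few lines: collect the interior critical point and the boundary candidates, evaluate f at each, and compare. The sketch below does this and, as an independent sanity check that is our own addition, also samples the boundary ellipse densely using the same parametrization.

```python
import math


def f(x, y):
    return x * x + y * y


def phi(t):
    # Parametrization of the boundary ellipse x^2 + 4y^2 = 4.
    return (2 * math.cos(t), math.sin(t))


# Interior critical point plus boundary candidates from g'(t) = 0 and the endpoints.
candidates = [(0.0, 0.0)] + [phi(t) for t in (0.0, math.pi / 2, math.pi, 3 * math.pi / 2)]
values = [f(x, y) for x, y in candidates]

# Independent check: sample the boundary at 1000 evenly spaced parameter values.
samples = [f(*phi(2 * math.pi * k / 1000)) for k in range(1000)]

print(max(values), min(values), max(samples), min(samples))
```

The largest sampled boundary value should match the maximum found among the candidates, while the smallest boundary value stays above the interior minimum, which is why the global minimum occurs at the critical point rather than on the boundary.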

As the previous example shows, dealing with the boundary of a region can require a significant amount of work. In this example we were helped by the fact that the boundary was one-dimensional and was easily parametrized. This is not always the case. For example, the boundary of the closed ball $\bar{B}^3((0,0,0),1)$ in $\mathbb{R}^3$ is the sphere $S^2((0,0,0),1)$ with equation

$$x^2 + y^2 + z^2 = 1,$$

a two-dimensional surface. We shall see in Chapter 4 that it is possible to parametrize such surfaces, but that would still leave us with a two-dimensional problem. We will return to this problem later in this section when we present a much more elegant solution based on our knowledge of level sets and gradient vectors.

Finding local extrema 

For now we will turn our attention to identifying local extreme values. Recall from one-variable calculus that one of the most useful ways to identify a local extreme value is through the second derivative test. That is, if $c$ is a critical point of $\varphi : \mathbb{R} \to \mathbb{R}$, then $\varphi''(c) > 0$ implies that $\varphi$ has a local minimum at $c$ and $\varphi''(c) < 0$ implies $\varphi$ has a local maximum at $c$. Taylor's theorem provides an easy way to see why this is so. For example, suppose $c$ is a critical point of $\varphi$, $\varphi''$ is continuous on an open interval containing $c$, and $\varphi''(c) > 0$. Then there is an interval $I = (c - r, c + r)$, $r > 0$, such that $\varphi''$ is continuous on $I$ and $\varphi''(t) > 0$ for all $t$ in $I$. By Taylor's theorem, for any $h \ne 0$ with $|h| < r$, there is a number $s$ between $c$ and $c + h$ such that

$$\varphi(c + h) = \varphi(c) + \varphi'(c)h + \frac{1}{2}\varphi''(s)h^2 = \varphi(c) + \frac{1}{2}\varphi''(s)h^2 > \varphi(c), \tag{3.5.1}$$

where we have used the fact that $\varphi'(c) = 0$ since $c$ is a critical point of $\varphi$. Hence $\varphi(c)$ is a local minimum value of $\varphi$.

Similar considerations lead to a second derivative test for a function $f : \mathbb{R}^n \to \mathbb{R}$. Suppose $\mathbf{c}$ is a critical point of $f$, $f$ is $C^2$ on an open set containing $\mathbf{c}$, and $Hf(\mathbf{c})$ is positive definite. Let $B^n(\mathbf{c}, r)$, $r > 0$, be an open ball on which $f$ is $C^2$ and $Hf(\mathbf{x})$ is positive definite. Then, by the version of Taylor's theorem in Section 3.4, for any $\mathbf{h} \ne \mathbf{0}$ with $\|\mathbf{h}\| < r$, there is a number $s$ between 0 and 1 such that

$$f(\mathbf{c} + \mathbf{h}) = f(\mathbf{c}) + \nabla f(\mathbf{c}) \cdot \mathbf{h} + \frac{1}{2}\mathbf{h}^T Hf(\mathbf{c} + s\mathbf{h})\mathbf{h} = f(\mathbf{c}) + \frac{1}{2}\mathbf{h}^T Hf(\mathbf{c} + s\mathbf{h})\mathbf{h} > f(\mathbf{c}), \tag{3.5.2}$$

where $\nabla f(\mathbf{c}) = \mathbf{0}$ since $\mathbf{c}$ is a critical point of $f$, and the final inequality follows from the assumption that $Hf(\mathbf{x})$ is positive definite for $\mathbf{x}$ in $B^n(\mathbf{c}, r)$. Hence $f(\mathbf{c})$ is a local minimum value of $f$. The same argument shows that if $Hf(\mathbf{c})$ is negative definite, then $f(\mathbf{c})$ is a local maximum value of $f$. If $Hf(\mathbf{c})$ is indefinite, then there will be arbitrarily small $\mathbf{h}$ for which

$$\frac{1}{2}\mathbf{h}^T Hf(\mathbf{c} + s\mathbf{h})\mathbf{h} > 0$$

and arbitrarily small $\mathbf{h}$ for which

$$\frac{1}{2}\mathbf{h}^T Hf(\mathbf{c} + s\mathbf{h})\mathbf{h} < 0.$$

Hence there will be arbitrarily small $\mathbf{h}$ for which $f(\mathbf{c} + \mathbf{h}) > f(\mathbf{c})$ and arbitrarily small $\mathbf{h}$ for which $f(\mathbf{c} + \mathbf{h}) < f(\mathbf{c})$. In this case, $f(\mathbf{c})$ is neither a local minimum nor a local maximum, and we call $\mathbf{c}$ a saddle point. Finally, if $Hf(\mathbf{c})$ is nondefinite, then we do not have enough information to classify the critical point. We may now state the second derivative test.

Second derivative test Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is $C^2$ on an open set $U$. If $\mathbf{c}$ is a critical point of $f$ in $U$, then $f(\mathbf{c})$ is a local minimum value of $f$ if $Hf(\mathbf{c})$ is positive definite, $f(\mathbf{c})$ is a local maximum value of $f$ if $Hf(\mathbf{c})$ is negative definite, and $\mathbf{c}$ is a saddle point if $Hf(\mathbf{c})$ is indefinite. If $Hf(\mathbf{c})$ is nondefinite, then more information is needed in order to classify $\mathbf{c}$.
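
For functions of two variables, the test can be sketched in a few lines of code. The function below is a hypothetical helper of our own, not part of the text: it takes the entries of a symmetric $2 \times 2$ Hessian $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$ and classifies the critical point from the sign of the determinant and of $a$. We assume here that "nondefinite" means neither definite nor indefinite, which for a $2 \times 2$ symmetric matrix is the case of zero determinant; the tolerance `tol` is also an assumption, used to guard against rounding error.

```python
def classify_critical_point(a, b, c, tol=1e-12):
    """Classify a critical point from the Hessian [[a, b], [b, c]].

    Assumes the Hessian is symmetric and was evaluated at a critical point.
    """
    det = a * c - b * b
    if det > tol:
        # The matrix is definite; the sign of a decides which kind.
        return "local minimum" if a > 0 else "local maximum"
    if det < -tol:
        # The matrix is indefinite.
        return "saddle point"
    # Zero determinant: the test gives no conclusion.
    return "inconclusive"

print(classify_critical_point(2, 0, -2))   # Hessian of x^2 - y^2 at (0, 0)
print(classify_critical_point(2, 1, 2))    # a positive definite example
print(classify_critical_point(2, -2, 2))   # Hessian of (x - y)^2, determinant 0
```

For more than two variables the determinant alone is not enough, and one would examine all of the leading principal minors or the eigenvalues instead.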

The next example gives an indication of the source of the term saddle point.

Example To find the local extreme values of $f(x,y) = x^2 - y^2$, we begin by finding

$$\nabla f(x,y) = (2x, -2y).$$

Now

$$\nabla f(x,y) = (0,0)$$

if and only if

$$2x = 0,$$
$$-2y = 0,$$

which occurs if and only if $x = 0$ and $y = 0$. Thus $f$ has the single critical point $(0,0)$. Now

$$Hf(x,y) = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix},$$

so

$$Hf(0,0) = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}.$$

Thus

$$\det(Hf(0,0)) = (2)(-2) = -4 < 0.$$

Hence $Hf(0,0)$ is indefinite and so, by the second derivative test, $(0,0)$ is a saddle point. Looking at the graph of $f$ in Figure 3.5.2, we can see the reason for this: since $f(x,0) = x^2$ and $f(0,y) = -y^2$, the slice of the graph of $f$ above the $x$-axis is a parabola opening upward while the slice of the graph of $f$ above the $y$-axis is a parabola opening downward.

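Example Consider $f(x,y) = xye^{-(x^2+y^2)}$. Then

$$\nabla f(x,y) = e^{-(x^2+y^2)}(y - 2x^2y,\ x - 2xy^2).$$

Hence, since $e^{-(x^2+y^2)} > 0$ for all $(x,y)$,

$$\nabla f(x,y) = (0,0)$$

if and only if

$$y - 2x^2y = 0,$$
$$x - 2xy^2 = 0,$$

which occurs if and only if

$$y(1 - 2x^2) = 0,$$
$$x(1 - 2y^2) = 0.$$

Now the first equation is satisfied if either $y = 0$ or $1 - 2x^2 = 0$. If $y = 0$, then the second equation becomes $x = 0$, so $(0,0)$ is a critical point. If $1 - 2x^2 = 0$, then either $x = -\frac{1}{\sqrt{2}}$ or $x = \frac{1}{\sqrt{2}}$. For either of these values of $x$, the second equation is satisfied if and only if $1 - 2y^2 = 0$, that is, $y = -\frac{1}{\sqrt{2}}$ or $y = \frac{1}{\sqrt{2}}$. Hence we have four more critical points: $\left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$, $\left(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$, $\left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$, and $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$. Now

$$Hf(x,y) = e^{-(x^2+y^2)} \begin{bmatrix} 4x^3y - 6xy & 4x^2y^2 - 2x^2 - 2y^2 + 1 \\ 4x^2y^2 - 2x^2 - 2y^2 + 1 & 4xy^3 - 6xy \end{bmatrix},$$

so

$$Hf(0,0) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},$$

$$Hf\left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right) = Hf\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right) = \begin{bmatrix} -2e^{-1} & 0 \\ 0 & -2e^{-1} \end{bmatrix},$$

and

$$Hf\left(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right) = Hf\left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right) = \begin{bmatrix} 2e^{-1} & 0 \\ 0 & 2e^{-1} \end{bmatrix}.$$

Since

$$\det \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = -1 < 0,$$

$$\det \begin{bmatrix} -2e^{-1} & 0 \\ 0 & -2e^{-1} \end{bmatrix} = 4e^{-2} > 0,$$

and

$$\det \begin{bmatrix} 2e^{-1} & 0 \\ 0 & 2e^{-1} \end{bmatrix} = 4e^{-2} > 0,$$

we see that $Hf(0,0)$ is indefinite, $Hf\left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$ and $Hf\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$ are negative definite, and $Hf\left(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$ and $Hf\left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$ are positive definite. Thus $(0,0)$ is a saddle point of $f$, $f$ has local maximums of $\frac{1}{2}e^{-1}$ at both $\left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$ and $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$, and local minimums of $-\frac{1}{2}e^{-1}$ at both $\left(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$ and $\left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$.

Figure 3.5.3 Graph of $f(x,y) = xye^{-(x^2+y^2)}$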
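
The critical points and Hessian entries in this example are easy to get wrong by hand, so a rough numerical cross-check may be useful. The sketch below approximates the gradient and the second partial derivatives of $f(x,y) = xye^{-(x^2+y^2)}$ by central differences; the step sizes `h` are arbitrary choices of ours, and the results are approximations only.

```python
import math

def f(x, y):
    return x * y * math.exp(-(x**2 + y**2))

def gradient(x, y, h=1e-5):
    # Central-difference approximation to (df/dx, df/dy).
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

def hessian(x, y, h=1e-4):
    # Central-difference approximations to f_xx, f_xy, f_yy.
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy

s = 1 / math.sqrt(2)
for point in [(0.0, 0.0), (s, s), (-s, -s), (s, -s), (-s, s)]:
    fxx, fxy, fyy = hessian(*point)
    det = fxx * fyy - fxy**2
    print(point, gradient(*point), round(fxx, 4), round(det, 4), round(f(*point), 4))
```

At each of the five points the approximate gradient is essentially zero, and the approximate determinants agree with the values $-1$ and $4e^{-2} \approx 0.5413$ computed above.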



Finding global extrema 

The graph of $f(x,y) = xye^{-(x^2+y^2)}$ in Figure 3.5.3 suggests that the local extreme values found in the previous example are in fact global extreme values for $f$ on all of $\mathbb{R}^2$. We may verify that this is in fact the case as follows. First note that, since

$$\lim_{r \to \infty} r^2 e^{-r^2} = 0,$$

we may choose $R$ large enough so that

$$r^2 e^{-r^2} < \frac{1}{2}e^{-1}$$

whenever $r > R$. Now for any point $(x,y)$ with $\|(x,y)\| = r > R$ we have

$$|f(x,y)| = \left|xye^{-(x^2+y^2)}\right| = |x||y|e^{-(x^2+y^2)} \le r^2 e^{-r^2} < \frac{1}{2}e^{-1}.$$

Hence $f(x,y)$ is between $-\frac{1}{2}e^{-1}$ and $\frac{1}{2}e^{-1}$ for all points $(x,y)$ outside of the closed disk $D = \bar{B}^2((0,0), R)$. Moreover, since $f(x,y)$ is between $-\frac{1}{2}e^{-1}$ and $\frac{1}{2}e^{-1}$ for all points $(x,y)$ on the boundary of $D$, $f$ has a minimum value of $-\frac{1}{2}e^{-1}$ and a maximum value of $\frac{1}{2}e^{-1}$ on $D$. Hence these values are actually the global extreme values of $f$ on all of $\mathbb{R}^2$.

Example A farmer wishes to build a rectangular storage bin, without a top, with a volume of 500 cubic meters using the least amount of material possible. If we let $x$ and $y$ be the dimensions of the base of the bin and $z$ be the height, all measured in meters, then the farmer wishes to minimize the surface area of the bin, given by

$$S = xy + 2xz + 2yz, \tag{3.5.3}$$

subject to the constraint on the volume, namely,

$$500 = xyz.$$

Solving for $z$ in the latter expression and substituting into (3.5.3), we have

$$S = xy + 2x\left(\frac{500}{xy}\right) + 2y\left(\frac{500}{xy}\right) = xy + \frac{1000}{y} + \frac{1000}{x}.$$

This is the function we need to minimize on the infinite open rectangle

$$R = \{(x,y) : x > 0,\ y > 0\}.$$

Now

$$\frac{\partial S}{\partial x} = y - \frac{1000}{x^2}$$

and

$$\frac{\partial S}{\partial y} = x - \frac{1000}{y^2},$$

so to find the critical points of $S$ we need to solve

$$y - \frac{1000}{x^2} = 0,$$
$$x - \frac{1000}{y^2} = 0.$$

Solving for $y$ in the first of these, we have

$$y = \frac{1000}{x^2},$$

which, when substituted into the second, gives us

$$x - \frac{x^4}{1000} = 0.$$

Hence we want

$$x\left(1 - \frac{x^3}{1000}\right) = 0,$$

from which it follows that either $x = 0$ or $x = 10$. Since the first of these will not give us a point in $R$, we have $x = 10$ and

$$y = \frac{1000}{10^2} = 10.$$

Thus the only critical point is $(10, 10)$. Now

$$HS(x,y) = \begin{bmatrix} \dfrac{2000}{x^3} & 1 \\ 1 & \dfrac{2000}{y^3} \end{bmatrix},$$

so

$$HS(10,10) = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}.$$

Thus

$$\det(HS(10,10)) = 3,$$

and so $HS(10,10)$ is positive definite. This shows that $S$ has a local minimum of

$$S\big|_{x=10,\,y=10} = (10)(10) + \frac{1000}{10} + \frac{1000}{10} = 300$$

at $(x,y) = (10,10)$. To show that this is actually the global minimum value of $S$, we proceed as follows. Let $D$ be the closed rectangle

$$D = \{(x,y) : 1 \le x \le 400,\ 1 \le y \le 400\}.$$



Now if $0 < x \le 1$, then

$$\frac{1000}{x} \ge 1000,$$

and so $S > 300$. Similarly, if $0 < y \le 1$, then $S > 300$. Moreover, if $x \ge 400$ and $y \ge 1$, then $xy \ge 400$, and so $S > 300$. Similarly, if $y \ge 400$ and $x \ge 1$, then $S > 300$. Hence $S > 300$ for all $(x,y)$ outside of $D$ and for all $(x,y)$ on the boundary of $D$. Hence $S$ has a global minimum of 300 on $D$, which, from the preceding observations, must in fact be the global minimum of $S$ on all of $R$. See the graph of $S$ in Figure 3.5.4. Finally, when $x = 10$ and $y = 10$, we have

$$z = \frac{500}{(10)(10)} = 5,$$

so the farmer should build her bin to have a base of 10 meters by 10 meters and a height of 5 meters.

Figure 3.5.4 Graph of $S = xy + \frac{1000}{x} + \frac{1000}{y}$
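
A coarse search offers an independent check of this answer. The sketch below evaluates $S = xy + \frac{1000}{x} + \frac{1000}{y}$ at the integer grid points of the rectangle $D$ and reports the smallest value found; the grid spacing of 1 and the function names are our own assumptions, and a grid search of this kind only suggests, rather than proves, the location of the minimum.

```python
def surface_area(x, y):
    # Surface area of the open-top bin after eliminating z = 500 / (x * y).
    return x * y + 1000 / x + 1000 / y

def grad_surface_area(x, y):
    # Partial derivatives of S with respect to x and y.
    return (y - 1000 / x**2, x - 1000 / y**2)

# Smallest value of S over the integer points of D = [1, 400] x [1, 400].
best = min(
    (surface_area(x, y), x, y)
    for x in range(1, 401)
    for y in range(1, 401)
)
print(best)
print(grad_surface_area(10, 10))
print(500 / (10 * 10))  # corresponding height z
```

The search returns the value 300 at $(10, 10)$, where both partial derivatives vanish, matching the critical point found above.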

Lagrange multipliers 

This last example has much in common with our first example in that they both involve finding extreme values of a function restricted to a lower-dimensional subset. In our first example, we had to find the extreme values of $f(x,y) = x^2 + y^2$ restricted to the one-dimensional ellipse with equation $x^2 + 4y^2 = 4$; in the example we just finished, we had to find the minimum value of $S = xy + 2xz + 2yz$, a function of three variables, restricted to the two-dimensional surface defined by the equation $xyz = 500$. Although they were similar, we approached these problems somewhat differently. In the first, we parametrized the ellipse and then maximized the composition of $f$ with this parametrization; in the latter, we solved for $z$ in terms of $x$ and $y$ and then substituted into the formula for $S$ to make $S$ effectively a function of two variables. Now we will describe a general approach which applies to both situations. Often, but not always, this method is easier to apply than the other two techniques. In practice, one tries to select the method that will yield an answer with the least resistance.

For the general case, consider two differentiable functions, $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}$, and suppose we wish to find the extreme values of $f$ on the level set $S$ of $g$ determined by the constraint $g(\mathbf{x}) = 0$. If $f$ has an extreme value at a point $\mathbf{c}$ on $S$, then $f(\mathbf{c})$ must be an extreme value of $f$ along any curve passing through $\mathbf{c}$. Thus if $\varphi : \mathbb{R} \to \mathbb{R}^n$ parametrizes a curve in $S$ with $\varphi(b) = \mathbf{c}$, then the function $h(t) = f(\varphi(t))$ has an extreme value at $b$. Hence

$$0 = h'(b) = \nabla f(\varphi(b)) \cdot D\varphi(b) = \nabla f(\mathbf{c}) \cdot D\varphi(b). \tag{3.5.4}$$

Since (3.5.4) holds for any curve in $S$ through $\mathbf{c}$ and $D\varphi(b)$ is tangent to the given curve at $\mathbf{c}$, it follows that $\nabla f(\mathbf{c})$ is orthogonal to the tangent hyperplane to $S$ at $\mathbf{c}$. But $S$ is a level set of $g$, so we know from our work in Section 3.3 that the vector $\nabla g(\mathbf{c})$, provided it is nonzero, is a normal vector for the tangent hyperplane to $S$ at $\mathbf{c}$. Hence $\nabla f(\mathbf{c})$ and $\nabla g(\mathbf{c})$ must be parallel. That is, there must exist a scalar $\lambda$ such that

$$\nabla f(\mathbf{c}) = \lambda \nabla g(\mathbf{c}). \tag{3.5.5}$$

The idea now is that in looking for extreme values, we need only consider points $\mathbf{c}$ for which both $g(\mathbf{c}) = 0$ and $\nabla f(\mathbf{c}) = \lambda \nabla g(\mathbf{c})$ for some scalar $\lambda$. The scalar $\lambda$ is known as a Lagrange multiplier, and this method for finding extreme values subject to a constraining equation is known as the method of Lagrange multipliers.

Example Suppose that the temperature at a point $(x,y,z)$ on the unit sphere $S = S^2((0,0,0),1)$ is given by

$$T(x,y,z) = 30 + 5(x + z).$$

To find the extreme values of $T$, we first define

$$g(x,y,z) = x^2 + y^2 + z^2 - 1,$$

thus making $S$ the level surface of $g$ specified by $g(x,y,z) = 0$. Now

$$\nabla T(x,y,z) = (5, 0, 5)$$

and

$$\nabla g(x,y,z) = (2x, 2y, 2z).$$

The candidates for the locations of extreme values will be solutions of the equations

$$\nabla T(x,y,z) = \lambda \nabla g(x,y,z),$$
$$g(x,y,z) = 0,$$

that is,

$$(5, 0, 5) = \lambda(2x, 2y, 2z),$$
$$x^2 + y^2 + z^2 - 1 = 0.$$

Hence we need to solve the following system of four equations in four unknowns:

$$5 = 2\lambda x,$$
$$0 = 2\lambda y,$$
$$5 = 2\lambda z,$$
$$x^2 + y^2 + z^2 = 1.$$

Now $5 = 2\lambda x$ implies that $\lambda \ne 0$, and so $0 = 2\lambda y$ implies that $y = 0$. Moreover, $5 = 2\lambda x$ and $5 = 2\lambda z$ imply that $2\lambda x = 2\lambda z$, from which it follows, since $\lambda \ne 0$, that $x = z$. Substituting these results into the final equation, we have

$$1 = x^2 + y^2 + z^2 = x^2 + 0 + x^2 = 2x^2.$$

Thus $x = -\frac{1}{\sqrt{2}}$ or $x = \frac{1}{\sqrt{2}}$, and we have two solutions for our equations,

$$\left(-\frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\right)$$

and

$$\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right).$$

At this point, since $T$ is continuous and $S$ is closed and bounded, we need only evaluate $T$ at these points and compare their values. Now

$$T\left(-\frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\right) = 30 - 5\sqrt{2} = 22.93$$

and

$$T\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right) = 30 + 5\sqrt{2} = 37.07,$$

where the final values have been rounded to two decimal places, so the maximum temperature on the sphere is 37.07 at $\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right)$ and the minimum temperature is 22.93 at $\left(-\frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\right)$.
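
To see that the two Lagrange candidates really do give the extreme temperatures, we can compare them with a brute-force sample of the sphere. The sketch below evaluates $T$ on a latitude-longitude grid; the grid resolution `n` and the variable names are assumptions made for this illustration only.

```python
import math

def T(x, y, z):
    return 30 + 5 * (x + z)

s = 1 / math.sqrt(2)
candidates = [(-s, 0.0, -s), (s, 0.0, s)]   # solutions of the Lagrange system
values = [T(*p) for p in candidates]

# Sample T on a grid of points of the unit sphere (spherical coordinates).
n = 200
sampled = []
for i in range(n + 1):
    theta = math.pi * i / n            # angle from the positive z-axis
    for j in range(2 * n):
        phi = math.pi * j / n          # angle around the z-axis
        sampled.append(T(math.sin(theta) * math.cos(phi),
                         math.sin(theta) * math.sin(phi),
                         math.cos(theta)))

print([round(v, 2) for v in values])
print(round(min(sampled), 2), round(max(sampled), 2))
```

No sampled temperature falls outside the interval between the two candidate values, which is what the method of Lagrange multipliers predicts.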

Example Suppose the farmer in our earlier example is faced with the opposite problem: Given 300 square meters of material, what are the dimensions of the rectangular bin, without a top, that holds the largest volume? If we again let $x$ and $y$ be the dimensions of the base of the bin and $z$ be its height, then we want to maximize

$$V = xyz$$

on the region where $x > 0$, $y > 0$, and $z > 0$, subject to the constraint that

$$xy + 2xz + 2yz = 300.$$

If we let

$$g(x,y,z) = xy + 2xz + 2yz - 300,$$

then our problem is to maximize $V$ subject to the constraint $g(x,y,z) = 0$. Now

$$\nabla V = (yz, xz, xy)$$

and

$$\nabla g(x,y,z) = (y + 2z,\ x + 2z,\ 2x + 2y),$$

so the system of equations

$$\nabla V = \lambda \nabla g(x,y,z),$$
$$g(x,y,z) = 0,$$

becomes the system

$$yz = \lambda(y + 2z), \tag{3.5.6}$$
$$xz = \lambda(x + 2z), \tag{3.5.7}$$
$$xy = \lambda(2x + 2y), \tag{3.5.8}$$
$$xy + 2xz + 2yz = 300. \tag{3.5.9}$$



Equations (3.5.6) and (3.5.7) imply that

$$\lambda = \frac{yz}{y + 2z}$$

and

$$\lambda = \frac{xz}{x + 2z},$$

so

$$\frac{yz}{y + 2z} = \frac{xz}{x + 2z},$$

that is,

$$\frac{y}{y + 2z} = \frac{x}{x + 2z}.$$

Hence

$$xy + 2yz = xy + 2xz.$$

Thus $2yz = 2xz$, so $x = y$. Substituting this result into (3.5.8) gives us $x^2 = 4\lambda x$, from which it follows that $x = 4\lambda$. Substituting into (3.5.7), we have

$$4\lambda z = \lambda(4\lambda + 2z) = 4\lambda^2 + 2\lambda z.$$

Hence $2\lambda z = 4\lambda^2$, so $z = 2\lambda$. Putting $x = 4\lambda$, $y = 4\lambda$, and $z = 2\lambda$ into (3.5.9) yields the equation

$$16\lambda^2 + 16\lambda^2 + 16\lambda^2 = 300.$$

Thus $48\lambda^2 = 300$, so

$$\lambda = \pm\sqrt{\frac{300}{48}} = \pm\sqrt{\frac{25}{4}} = \pm\frac{5}{2}.$$

Now $x$, $y$, and $z$ are all positive, so we must have $\lambda = \frac{5}{2}$, giving us $x = 10$, $y = 10$, and $z = 5$. To show that we have the location of the maximum value of $V$, let

$$S = \{(x,y,z) : g(x,y,z) = 0,\ x > 0,\ y > 0,\ z > 0\}$$

and let $D$ be that part of $S$ for which $1 \le x \le 150$, $1 \le y \le 150$, and $1 \le z \le 150$. Note that if $(x,y,z)$ lies on $S$, then

$$300 = xy + 2xz + 2yz$$

and so $xy < 300$, $xz < 150$, and $yz < 150$. Moreover,

$$z = \frac{300 - xy}{2x + 2y} < \frac{300}{2x + 2y}.$$

Now if either $x \ge 150$ or $y \ge 150$, then

$$z < \frac{300}{300} = 1,$$

so

$$V = xyz < (300)(1) = 300.$$

If $x \le 1$,

$$V = xyz < (1)(150) = 150$$

and, similarly, if $y \le 1$,

$$V = yxz < (1)(150) = 150.$$

Likewise, if $z \le 1$, then $V = (xy)z < (300)(1) = 300$, and if $z \ge 150$, then $xz < 150$ forces $x < 1$, so again $V < 150$. Thus if $(x,y,z)$ is either on the boundary of $D$ or outside of $D$, then $V < 300$. Since

$$V\big|_{(x,y,z)=(10,10,5)} = 500,$$

it follows that the global maximum of $V$ on $S$ must occur inside $D$. In fact, this maximum value must be 500 cubic meters, occurring when $x = 10$ meters, $y = 10$ meters, and $z = 5$ meters.
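
The solution of the Lagrange system can be checked by direct substitution. The sketch below computes the residuals of equations (3.5.6) through (3.5.9) at $(x,y,z) = (10,10,5)$ with $\lambda = \frac{5}{2}$, and then compares the resulting volume with the volumes at integer base dimensions, using $z = \frac{300 - xy}{2x + 2y}$ from the constraint. The function names and the integer grid are our own choices for illustration.

```python
def grad_V(x, y, z):
    return (y * z, x * z, x * y)

def grad_g(x, y, z):
    return (y + 2 * z, x + 2 * z, 2 * x + 2 * y)

def g(x, y, z):
    return x * y + 2 * x * z + 2 * y * z - 300

x, y, z, lam = 10.0, 10.0, 5.0, 2.5

# Residuals of grad V = lambda * grad g; all three should be zero.
residuals = [v - lam * w for v, w in zip(grad_V(x, y, z), grad_g(x, y, z))]
print(residuals, g(x, y, z), x * y * z)

def height(a, b):
    # Height forced by the constraint once the base a x b is chosen.
    return (300 - a * b) / (2 * a + 2 * b)

# Largest volume over integer bases with a * b < 300 (so that z > 0).
best = max(
    (a * b * height(a, b), a, b, height(a, b))
    for a in range(1, 151)
    for b in range(1, 151)
    if a * b < 300
)
print(best)
```

The residuals and the constraint value are all zero, and the grid search finds nothing larger than 500 cubic meters.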

Problems 

1. Find the maximum and minimum values of $f(x,y) = xy$ on the set $D = \{(x,y) : x^2 + y^2 \le 1\}$.

2. Find the maximum and minimum values of $f(x,y) = 8 - x^2 - y^2$ on the set $D = \{(x,y) : x^2 + 9y^2 \le 9\}$.

3. Find the maximum and minimum values of $f(x,y) = x^2 + 3xy + y^2$ on the set $D = \{(x,y) : x^2 + y^2 \le 4\}$.

4. Find all local extreme values of $f(x,y) = xe^{-(x^2+y^2)}$.

5. Find all local extreme values of $g(x,y) = x^2e^{-(x^2+y^2)}$.

6. Find all local extreme values of g(x, y) = = ^ • 

1 + x z + y z 

7. Find all local extreme values of $f(x,y) = 4xy - 2x^2 - y^4$.

8. Find all local extreme values of $h(x,y) = 2x^4 + y^4 - x^2 - 2y^2$.

9. Find all local extreme values of $f(x,y,z) = x^2 + y^2 + z^2$.

10. Find all local extreme values of $g(x,y,z) = x^2 + y^2 - z^2$.

11. A farmer wishes to build a rectangular bin, with a top, to hold a volume of 1000 cubic meters. Find the dimensions of the bin that will minimize the amount of material needed in its construction.

12. A farmer wishes to build a rectangular bin, with a top, using 600 square meters of material. Find the dimensions of the bin that will maximize the volume.

13. Find the extreme values of $f(x,y,z) = x + y + z$ on the sphere with equation $x^2 + y^2 + z^2 = 1$.

14. Find the minimum distance in $\mathbb{R}^2$ from the origin to the line with equation $3x + 2y = 4$.

15. Find the minimum distance in $\mathbb{R}^3$ from the origin to the plane with equation $2x + 4y + z = 6$.

16. Find the minimum distance in $\mathbb{R}^2$ from the origin to the curve with equation $xy = 1$.

17. The ellipsoid with equation $x^2 + 2y^2 + z^2 = 4$ is heated so that its temperature at $(x,y,z)$ is given by $T(x,y,z) = 70 + 10(x - z)$. Find the hottest and coldest points on the ellipsoid.

18. Suppose an airline requires that the sum of the length, width, and height of carry-on luggage cannot exceed 45 inches (assuming the luggage is in the shape of a rectangular box). Find the dimensions of a piece of carry-on luggage that has the maximum volume.

19. Let $f(x,y) = (y - 4x^2)(y - x^2)$.

(a) Verify that $(0,0)$ is a critical point of $f$.

(b) Show that $Hf(0,0)$ is nondefinite.

(c) Show that along any line through the origin, $f$ has a local minimum at $(0,0)$.

(d) Find a curve through the origin such that, along the curve, $f$ has a local maximum at $(0,0)$. Note that this shows that $(0,0)$ is a saddle point.

20. Let $f(x,y) = (x - y)^2$. Find all critical points of $f$ and categorize them according as they are either saddle points or the location of local extreme values. Is the second derivative test useful in this case?

21. Let $g(x,y) = \sin(x^2 + y^2)$. Find all critical points of $g$. Which critical points are the location of local maximums? Local minimums? Are there any saddle points?

22. What does a plot of the gradient vectors look like around a saddle point of a function $f : \mathbb{R}^2 \to \mathbb{R}$? You might look at some examples, like $f(x,y) = x^2 - y^2$, $f(x,y) = xy$, or even $f(x,y) = xye^{-(x^2+y^2)}$.

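23. Given $n$ points $(x_1,y_1), (x_2,y_2), \ldots, (x_n,y_n)$ in $\mathbb{R}^2$, the line with equation $y = mx + b$ which minimizes

$$L(m,b) = \sum_{i=1}^{n} (y_i - (mx_i + b))^2$$

is called the least squares line.

(a) Give a geometric interpretation for $L(m,b)$.

(b) Show that the parameters of the least squares line are

$$m = \frac{n\displaystyle\sum_{i=1}^{n} x_iy_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n\displaystyle\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$

and

$$b = \bar{y} - m\bar{x},$$

where

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

and

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$$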

24. The following table is taken from a report prepared in the 1960's to study the effect of leaks of radioactive waste from storage bins at the nuclear facilities at Hanford, Washington, on the cancer rates in nine Oregon counties which border the Columbia River. The table gives an index of exposure, which takes into account such things as distance from the Hanford facilities and the distance of the population from the river, along with the cancer mortality rate per 100,000 people.

County        Index of Exposure   Cancer Mortality
Umatilla       2.49               147.1
Morrow         2.57               130.1
Gilliam        3.41               129.9
Sherman        1.25               113.5
Wasco          1.62               137.5
Hood River     3.83               162.3
Portland      11.64               207.5
Columbia       6.41               177.9
Clatsop        8.34               210.3

Using Problem 23, find the least squares line for this data (let the index of exposure be the $x$ data). Plot the points along with the line.



The Calculus of Functions of Several Variables



We will first define the definite integral for a function $f : \mathbb{R}^2 \to \mathbb{R}$ and later indicate how the definition may be extended to functions of three or more variables.

Cartesian products

We will find the following notation useful. Given two sets of real numbers $A$ and $B$, we define the Cartesian product of $A$ and $B$ to be the set

$$A \times B = \{(x,y) : x \in A,\ y \in B\}. \tag{3.6.1}$$

For example, if $A = \{1, 2, 3\}$ and $B = \{5, 6\}$, then

$$A \times B = \{(1,5), (1,6), (2,5), (2,6), (3,5), (3,6)\}.$$

In particular, if $a < b$, $c < d$, $A = [a,b]$, and $B = [c,d]$, then $A \times B = [a,b] \times [c,d]$ is the closed rectangle

$$\{(x,y) : a \le x \le b,\ c \le y \le d\},$$

as shown in Figure 3.6.1.



Figure 3.6.1 The closed rectangle $[a,b] \times [c,d]$

Copyright © by Dan Sloughter 2001

More generally, given real numbers $a_i < b_i$, $i = 1, 2, 3, \ldots, n$, we may write

$$[a_1,b_1] \times [a_2,b_2] \times \cdots \times [a_n,b_n]$$

for the closed rectangle

$$\{(x_1,x_2,\ldots,x_n) : a_i \le x_i \le b_i,\ i = 1, 2, \ldots, n\}$$

and

$$(a_1,b_1) \times (a_2,b_2) \times \cdots \times (a_n,b_n)$$

for the open rectangle

$$\{(x_1,x_2,\ldots,x_n) : a_i < x_i < b_i,\ i = 1, 2, \ldots, n\}.$$
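
For finite sets, Cartesian products can be listed mechanically. The short sketch below uses `itertools.product` from the Python standard library to reproduce the example $A \times B$ above, and to list the corners of a three-dimensional closed rectangle as the product of its endpoint pairs. The particular rectangle $[0,1] \times [2,5] \times [-1,1]$ is a hypothetical choice of ours.

```python
from itertools import product

A = [1, 2, 3]
B = [5, 6]
pairs = list(product(A, B))
print(pairs)  # [(1, 5), (1, 6), (2, 5), (2, 6), (3, 5), (3, 6)]

# Corners of the closed rectangle [0, 1] x [2, 5] x [-1, 1]:
# each corner picks one endpoint from each interval.
intervals = [(0, 1), (2, 5), (-1, 1)]
corners = list(product(*intervals))
print(len(corners), corners[0], corners[-1])
```

A rectangle in $\mathbb{R}^n$ has $2^n$ corners, so the second product has 8 elements.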

Definite integrals on rectangles 

Given $a < b$ and $c < d$, let

$$D = [a,b] \times [c,d]$$

and suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is defined on all of $D$. Moreover, we suppose $f$ is bounded on $D$, that is, there exist constants $m$ and $M$ such that $m \le f(x,y) \le M$ for all $(x,y)$ in $D$. In particular, the Extreme Value Theorem implies that $f$ is bounded on $D$ if $f$ is continuous on $D$. Our definition of the definite integral of $f$ over the rectangle $D$ will follow the definition from one-variable calculus. Given positive integers $m$ and $n$, we let $P$ be a partition of $[a,b]$ into $m$ intervals, that is, a set $P = \{x_0, x_1, \ldots, x_m\}$ where

$$a = x_0 < x_1 < \cdots < x_m = b, \tag{3.6.2}$$

and we let $Q$ be a partition of $[c,d]$ into $n$ intervals, that is, a set $Q = \{y_0, y_1, \ldots, y_n\}$ where

$$c = y_0 < y_1 < \cdots < y_n = d. \tag{3.6.3}$$

We will let $P \times Q$ denote the partition of $D$ into $mn$ rectangles

$$D_{ij} = [x_{i-1}, x_i] \times [y_{j-1}, y_j], \tag{3.6.4}$$

where $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$. Note that $D_{ij}$ has area $\Delta x_i \Delta y_j$, where

$$\Delta x_i = x_i - x_{i-1} \tag{3.6.5}$$

and

$$\Delta y_j = y_j - y_{j-1}. \tag{3.6.6}$$

An example is shown in Figure 3.6.2.

Figure 3.6.2 A partition of a rectangle $[a,b] \times [c,d]$

Now let $m_{ij}$ be the largest real number with the property that $m_{ij} \le f(x,y)$ for all $(x,y)$ in $D_{ij}$ and $M_{ij}$ be the smallest real number with the property that $f(x,y) \le M_{ij}$ for all $(x,y)$ in $D_{ij}$. Note that if $f$ is continuous on $D$, then $m_{ij}$ is simply the minimum value of $f$ on $D_{ij}$ and $M_{ij}$ is the maximum value of $f$ on $D_{ij}$, both of which are guaranteed to exist by the Extreme Value Theorem. If $f$ is not continuous, our assumption that $f$ is bounded nevertheless guarantees the existence of the $m_{ij}$ and $M_{ij}$, although the justification for this statement lies beyond the scope of this book.

We may now define the lower sum, $L(f, P \times Q)$, for $f$ with respect to the partition $P \times Q$ by

$$L(f, P \times Q) = \sum_{i=1}^{m} \sum_{j=1}^{n} m_{ij} \Delta x_i \Delta y_j \tag{3.6.7}$$

and the upper sum, $U(f, P \times Q)$, for $f$ with respect to the partition $P \times Q$ by

$$U(f, P \times Q) = \sum_{i=1}^{m} \sum_{j=1}^{n} M_{ij} \Delta x_i \Delta y_j. \tag{3.6.8}$$

Geometrically, if $f(x,y) \ge 0$ for all $(x,y)$ in $D$ and $V$ is the volume of the region which lies beneath the graph of $f$ and above the rectangle $D$, then $L(f, P \times Q)$ and $U(f, P \times Q)$ represent lower and upper bounds, respectively, for $V$. (See Figure 3.6.3 for an example of one term of a lower sum.) Moreover, we should expect that these bounds can be made arbitrarily close to $V$ using sufficiently fine partitions $P$ and $Q$. In part this implies that we may characterize $V$ as the only real number which lies between $L(f, P \times Q)$ and $U(f, P \times Q)$ for all choices of partitions $P$ and $Q$. This is the basis for the following definition.

Definition Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is bounded on the rectangle $D = [a,b] \times [c,d]$. With the notation as above, we say $f$ is integrable on $D$ if there exists a unique real number $I$ such that

$$L(f, P \times Q) \le I \le U(f, P \times Q) \tag{3.6.9}$$

for all partitions $P$ of $[a,b]$ and $Q$ of $[c,d]$. If $f$ is integrable on $D$, we call $I$ the definite integral of $f$ on $D$, which we denote

$$I = \iint_D f(x,y)\,dx\,dy. \tag{3.6.10}$$

Geometrically, if $f(x,y) \ge 0$ for all $(x,y)$ in $D$, we may think of the definite integral of $f$ on $D$ as the volume of the region in $\mathbb{R}^3$ which lies beneath the graph of $f$ and above the rectangle $D$. Other interpretations include total mass of the rectangle $D$ (if $f(x,y)$ represents the density of mass at the point $(x,y)$) and total electric charge of the rectangle $D$ (if $f(x,y)$ represents the charge density at the point $(x,y)$).

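Example Suppose $f(x,y) = x^2 + y^2$ and $D = [0,1] \times [0,3]$. If we let

$$P = \left\{0, \frac{1}{2}, 1\right\}$$

and

$$Q = \{0, 1, 2, 3\},$$

then the minimum value of $f$ on each rectangle of the partition $P \times Q$ occurs at the lower left-hand corner of the rectangle and the maximum value of $f$ occurs at the upper right-hand corner of the rectangle. See Figure 3.6.3 for a picture of one term of the lower sum. Hence

$$\begin{aligned}
L(f, P \times Q) &= f(0,0) \times \frac{1}{2} \times 1 + f\left(\frac{1}{2},0\right) \times \frac{1}{2} \times 1 + f(0,1) \times \frac{1}{2} \times 1 \\
&\quad + f\left(\frac{1}{2},1\right) \times \frac{1}{2} \times 1 + f(0,2) \times \frac{1}{2} \times 1 + f\left(\frac{1}{2},2\right) \times \frac{1}{2} \times 1 \\
&= 0 + \frac{1}{8} + \frac{1}{2} + \frac{5}{8} + 2 + \frac{17}{8} \\
&= \frac{43}{8} = 5.375
\end{aligned}$$

and

$$\begin{aligned}
U(f, P \times Q) &= f\left(\frac{1}{2},1\right) \times \frac{1}{2} \times 1 + f(1,1) \times \frac{1}{2} \times 1 + f\left(\frac{1}{2},2\right) \times \frac{1}{2} \times 1 \\
&\quad + f(1,2) \times \frac{1}{2} \times 1 + f\left(\frac{1}{2},3\right) \times \frac{1}{2} \times 1 + f(1,3) \times \frac{1}{2} \times 1 \\
&= \frac{5}{8} + 1 + \frac{17}{8} + \frac{5}{2} + \frac{37}{8} + 5 \\
&= \frac{127}{8} = 15.875.
\end{aligned}$$

We will see below that the continuity of $f$ implies that $f$ is integrable on $D$, so we may conclude that

$$5.375 \le \iint_D (x^2 + y^2)\,dx\,dy \le 15.875.$$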
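
The arithmetic in this example can be reproduced exactly with rational numbers. The sketch below computes the lower and upper sums for $f(x,y) = x^2 + y^2$ with respect to $P \times Q$; it relies on the observation made above that, for this particular $f$ on this $D$, the minimum on each rectangle is at the lower left-hand corner and the maximum is at the upper right-hand corner. The function name `lower_upper_sums` is our own, and the corner rule would not be valid for an arbitrary function.

```python
from fractions import Fraction

def f(x, y):
    return x**2 + y**2

def lower_upper_sums(P, Q):
    # Valid only when f is increasing in both x and y on the rectangle.
    lower = upper = Fraction(0)
    for i in range(1, len(P)):
        for j in range(1, len(Q)):
            area = (P[i] - P[i - 1]) * (Q[j] - Q[j - 1])
            lower += f(P[i - 1], Q[j - 1]) * area   # lower left-hand corner
            upper += f(P[i], Q[j]) * area           # upper right-hand corner
    return lower, upper

P = [Fraction(0), Fraction(1, 2), Fraction(1)]
Q = [Fraction(0), Fraction(1), Fraction(2), Fraction(3)]
lower, upper = lower_upper_sums(P, Q)
print(lower, upper)                  # 43/8 127/8
print(float(lower), float(upper))    # 5.375 15.875
```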



Example Suppose $k$ is a constant and $f(x,y) = k$ for all $(x,y)$ in the rectangle $D = [a,b] \times [c,d]$. Then for any partitions $P = \{x_0, x_1, \ldots, x_m\}$ of $[a,b]$ and $Q = \{y_0, y_1, \ldots, y_n\}$ of $[c,d]$, $m_{ij} = k = M_{ij}$ for $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$. Hence

$$\begin{aligned}
L(f, P \times Q) &= U(f, P \times Q) \\
&= \sum_{i=1}^{m} \sum_{j=1}^{n} k \Delta x_i \Delta y_j \\
&= k \sum_{i=1}^{m} \sum_{j=1}^{n} \Delta x_i \Delta y_j \\
&= k \times (\text{area of } D) \\
&= k(b - a)(d - c).
\end{aligned}$$

Hence $f$ is integrable and

$$\iint_D f(x,y)\,dx\,dy = \iint_D k\,dx\,dy = k(b - a)(d - c).$$

Of course, geometrically this result is saying that the volume of a box with height $k$ and base $D$ is $k$ times the area of $D$. In particular, if $k = 1$ we see that

$$\iint_D dx\,dy = \text{area of } D.$$

Example If $D = [1,2] \times [-1,3]$, then

$$\iint_D 5\,dx\,dy = 5(2 - 1)(3 + 1) = 20.$$

The properties of the definite integral stated in the following proposition follow easily from the definition, although we will omit the somewhat technical details.

Proposition Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ and $g : \mathbb{R}^2 \to \mathbb{R}$ are both integrable on the rectangle $D = [a,b] \times [c,d]$ and $k$ is a scalar constant. Then

$$\iint_D (f(x,y) + g(x,y))\,dx\,dy = \iint_D f(x,y)\,dx\,dy + \iint_D g(x,y)\,dx\,dy, \tag{3.6.11}$$

$$\iint_D kf(x,y)\,dx\,dy = k\iint_D f(x,y)\,dx\,dy, \tag{3.6.12}$$

and, if $f(x,y) \le g(x,y)$ for all $(x,y)$ in $D$,

$$\iint_D f(x,y)\,dx\,dy \le \iint_D g(x,y)\,dx\,dy. \tag{3.6.13}$$

Our definition does not provide a practical method for determining whether a given function is integrable or not. A complete characterization of integrability is beyond the scope of this text, but we shall find one simple condition very useful: if $f$ is continuous on an open set containing the rectangle $D$, then $f$ is integrable on $D$. Although we will not attempt a full proof of this result, the outline is as follows. If $f$ is continuous on $D = [a,b] \times [c,d]$ and we are given any $\epsilon > 0$, then it is possible to find partitions $P$ of $[a,b]$ and $Q$ of $[c,d]$ sufficiently fine to guarantee that if $(x,y)$ and $(u,v)$ are points in the same rectangle $D_{ij}$ of the partition $P \times Q$ of $D$, then

$$|f(x,y) - f(u,v)| < \frac{\epsilon}{(b - a)(d - c)}. \tag{3.6.14}$$

(Note that this is not a direct consequence of the continuity of $f$, but follows from a slightly deeper property of continuous functions on closed bounded sets known as uniform continuity.) It follows that if $m_{ij}$ is the minimum value and $M_{ij}$ is the maximum value of $f$ on $D_{ij}$, then

$$\begin{aligned}
U(f, P \times Q) - L(f, P \times Q) &= \sum_{i=1}^{m} \sum_{j=1}^{n} M_{ij} \Delta x_i \Delta y_j - \sum_{i=1}^{m} \sum_{j=1}^{n} m_{ij} \Delta x_i \Delta y_j \\
&= \sum_{i=1}^{m} \sum_{j=1}^{n} (M_{ij} - m_{ij}) \Delta x_i \Delta y_j \\
&< \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{\epsilon}{(b - a)(d - c)} \Delta x_i \Delta y_j \\
&= \frac{\epsilon}{(b - a)(d - c)} \sum_{i=1}^{m} \sum_{j=1}^{n} \Delta x_i \Delta y_j \\
&= \frac{\epsilon}{(b - a)(d - c)} (b - a)(d - c) \\
&= \epsilon.
\end{aligned} \tag{3.6.15}$$

It now follows that we may find upper and lower sums which are arbitrarily close, from which follows the integrability of $f$.
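
To illustrate how the gap between upper and lower sums shrinks as the partitions are refined, the sketch below computes $U(f, P \times Q) - L(f, P \times Q)$ exactly for $f(x,y) = x^2 + y^2$ on $[0,1] \times [0,3]$, using partitions into $n$ equal subintervals on each side. As before, it assumes that the minimum and maximum on each rectangle occur at the lower left-hand and upper right-hand corners, which holds for this $f$ but not in general; the function name `sum_gap` is ours.

```python
from fractions import Fraction

def f(x, y):
    return x**2 + y**2

def sum_gap(n):
    """Return U - L for f on [0,1] x [0,3] with n equal subintervals per side."""
    dx, dy = Fraction(1, n), Fraction(3, n)
    gap = Fraction(0)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            M_ij = f(i * dx, j * dy)              # maximum: upper right-hand corner
            m_ij = f((i - 1) * dx, (j - 1) * dy)  # minimum: lower left-hand corner
            gap += (M_ij - m_ij) * dx * dy
    return gap

for n in (1, 10, 100):
    print(n, sum_gap(n))
```

For this example the gap works out to $\frac{30}{n}$, so it can be made smaller than any given $\epsilon > 0$ by taking $n$ large enough.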

Theorem If $f$ is continuous on an open set containing the rectangle $D$, then $f$ is integrable on $D$.

Example If $f(x,y) = x^2 + y^2$, then $f$ is continuous on all of $\mathbb{R}^2$. Hence $f$ is integrable on $D = [0,1] \times [0,3]$.

Iterated integrals 

Now suppose we have a rectangle $D = [a,b] \times [c,d]$ and a continuous function $f : \mathbb{R}^2 \to \mathbb{R}$ such that $f(x,y) \ge 0$ for all $(x,y)$ in $D$. Let

$$B = \{(x,y,z) : (x,y) \in D,\ 0 \le z \le f(x,y)\}. \tag{3.6.16}$$

Then $B$ is the region in $\mathbb{R}^3$ bounded below by $D$ and above by the graph of $f$. If we let $V$ be the volume of $B$, then

$$V = \iint_D f(x,y)\,dx\,dy. \tag{3.6.17}$$

However, there is another approach to finding $V$. If, for every $c \le y \le d$, we let

$$\alpha(y) = \int_a^b f(x,y)\,dx, \tag{3.6.18}$$

then $\alpha(y)$ is the area of a slice of $B$ cut by a plane orthogonal to both the $xy$-plane and the $yz$-plane and passing through the point $(0,y,0)$ on the $y$-axis (see Figure 3.6.4 for an example). If we let the partition $Q = \{y_0, y_1, \ldots, y_n\}$ divide $[c,d]$ into $n$ intervals of equal length $\Delta y$, then we may approximate $V$ by

$$\sum_{j=1}^{n} \alpha(y_j)\Delta y. \tag{3.6.19}$$

That is, we may approximate $V$ by slicing $B$ into slabs of thickness $\Delta y$ perpendicular to the $yz$-plane, and then summing approximations to the volume of each slab. As $n$ increases, this approximation should converge to $V$; at the same time, since (3.6.19) is a right-hand rule approximation to the definite integral of $\alpha$ over $[c,d]$, the sum should converge to

$$\int_c^d \alpha(y)\,dy$$

as $n$ increases. That is, we should have

$$V = \lim_{n \to \infty} \sum_{j=1}^{n} \alpha(y_j)\Delta y = \int_c^d \alpha(y)\,dy = \int_c^d \left(\int_a^b f(x,y)\,dx\right) dy. \tag{3.6.20}$$

Note that the expression on the right-hand side of (3.6.20) is not the definite integral of $f$ over $D$, but rather two successive integrals of one variable. Also, we could have reversed our order and first integrated with respect to $y$ and then integrated the result with respect to $x$.

Definition Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is defined on a rectangle $D = [a,b] \times [c,d]$. The iterated integrals of $f$ over $D$ are

$$\int_c^d \int_a^b f(x,y)\,dx\,dy = \int_c^d \left(\int_a^b f(x,y)\,dx\right) dy \tag{3.6.21}$$

and

$$\int_a^b \int_c^d f(x,y)\,dy\,dx = \int_a^b \left(\int_c^d f(x,y)\,dy\right) dx. \tag{3.6.22}$$

In the situation of the preceding paragraph, we should expect the iterated integrals in (3.6.21) and (3.6.22) to be equal since they should both equal $V$, the volume of the region $B$. Moreover, since we also know that

$$V = \iint_D f(x,y)\,dx\,dy,$$

the iterated integrals should both be equal to the definite integral of $f$ over $D$. These statements may in fact be verified as long as $f$ is integrable on $D$ and the iterated integrals exist. In this case, iterated integrals provide a method of evaluating double integrals in terms of integrals of a single variable (for which we may use the Fundamental Theorem of Calculus).




Fubini's Theorem (for rectangles) Suppose $f$ is integrable over the rectangle $D = [a,b] \times [c,d]$. If

$$\int_c^d \int_a^b f(x,y)\,dx\,dy$$

exists, then

$$\iint_D f(x,y)\,dx\,dy = \int_c^d \int_a^b f(x,y)\,dx\,dy. \tag{3.6.23}$$

If

$$\int_a^b \int_c^d f(x,y)\,dy\,dx$$

exists, then

$$\iint_D f(x,y)\,dx\,dy = \int_a^b \int_c^d f(x,y)\,dy\,dx. \tag{3.6.24}$$



Example To find the volume V of the region beneath the graph of $f(x,y) = x^2 + y^2$
and over the rectangle $D = [0, 1] \times [0, 3]$ (as shown in Figure 3.6.5), we compute

\[ \begin{aligned}
V &= \iint_D (x^2 + y^2)\,dx\,dy \\
  &= \int_0^3 \int_0^1 (x^2 + y^2)\,dx\,dy \\
  &= \int_0^3 \left[ \frac{x^3}{3} + xy^2 \right]_0^1 dy \\
  &= \int_0^3 \left( \frac{1}{3} + y^2 \right) dy \\
  &= \left[ \frac{y}{3} + \frac{y^3}{3} \right]_0^3 \\
  &= 1 + 9 \\
  &= 10.
\end{aligned} \]



We could also compute the iterated integral in the other order: 



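\[ \begin{aligned}
V &= \iint_D (x^2 + y^2)\,dx\,dy \\
  &= \int_0^1 \int_0^3 (x^2 + y^2)\,dy\,dx \\
  &= \int_0^1 \left[ x^2 y + \frac{y^3}{3} \right]_0^3 dx \\
  &= \int_0^1 (3x^2 + 9)\,dx \\
  &= \left[ x^3 + 9x \right]_0^1 \\
  &= 1 + 9 \\
  &= 10.
\end{aligned} \]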
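The agreement of the two orders of integration can also be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `midpoint_iterated` and the choice of the midpoint rule with 200 subintervals per axis are assumptions made for illustration.

```python
def midpoint_iterated(f, a, b, c, d, n=200, x_inner=True):
    """Approximate an iterated integral of f over [a, b] x [c, d] by the midpoint rule."""
    hx = (b - a) / n
    hy = (d - c) / n
    xs = [a + (i + 0.5) * hx for i in range(n)]  # midpoints in x
    ys = [c + (j + 0.5) * hy for j in range(n)]  # midpoints in y
    if x_inner:
        # integrate with respect to x first, then with respect to y
        return sum(sum(f(x, y) for x in xs) * hx for y in ys) * hy
    # integrate with respect to y first, then with respect to x
    return sum(sum(f(x, y) for y in ys) * hy for x in xs) * hx


f = lambda x, y: x**2 + y**2
dx_dy = midpoint_iterated(f, 0.0, 1.0, 0.0, 3.0, x_inner=True)
dy_dx = midpoint_iterated(f, 0.0, 1.0, 0.0, 3.0, x_inner=False)
print(dx_dy, dy_dx)  # both values should be close to 10
```

Both orders sum the same grid of midpoint values, so they agree up to rounding; the approximation error relative to the exact value 10 shrinks as `n` grows.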



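Example If $D = [1,2] \times [0, 1]$, then

\[ \iint_D x^2 y\,dx\,dy = \int_1^2 \int_0^1 x^2 y\,dy\,dx = \int_1^2 \left[ \frac{x^2 y^2}{2} \right]_0^1 dx = \int_1^2 \frac{x^2}{2}\,dx = \left[ \frac{x^3}{6} \right]_1^2 = \frac{8}{6} - \frac{1}{6} = \frac{7}{6}. \]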



Definite integrals on other regions 

Integrals over intervals suffice for most applications of functions of a single variable. However,
for functions of two variables it is important to consider integrals on regions other
than rectangles. To extend our definition, consider a function $f : \mathbb{R}^2 \to \mathbb{R}$ defined on a
bounded region D. Let $D^*$ be a rectangle containing D and, for any $(x,y)$ in $D^*$, define

\[ f^*(x,y) = \begin{cases} f(x,y), & \text{if } (x,y) \in D, \\ 0, & \text{if } (x,y) \notin D. \end{cases} \tag{3.6.25} \]

In other words, $f^*$ is identical to f on D and is 0 at all points of $D^*$ outside of D. Now if
$f^*$ is integrable on $D^*$, then, since the region where $f^*$ is 0 should contribute nothing
to the value of the integral, it is reasonable to define the integral of f over D to be equal
to the integral of $f^*$ over $D^*$.

Definition Suppose f is defined on a bounded region D of $\mathbb{R}^2$ and let $D^*$ be any rectangle
containing D. Define $f^*$ as in (3.6.25). We say f is integrable on D if $f^*$ is integrable on
$D^*$, in which case we define

\[ \iint_D f(x, y)\,dx\,dy = \iint_{D^*} f^*(x, y)\,dx\,dy. \tag{3.6.26} \]

Note that the integrability of f on a region D depends not only on the nature of f, but
on the region D as well. In particular, even if f is continuous on an open set containing D,
it may still turn out that f is not integrable on D because of the complicated nature of the
boundary of D. Fortunately, there are two basic types of regions which occur frequently
and to which our previous theorems generalize.

Definition We say a region D in $\mathbb{R}^2$ is of Type I if there exist real numbers $a < b$ and
continuous functions $\alpha : \mathbb{R} \to \mathbb{R}$ and $\beta : \mathbb{R} \to \mathbb{R}$ such that $\alpha(x) \le \beta(x)$ for all x in $[a, b]$
and

\[ D = \{(x, y) : a \le x \le b,\ \alpha(x) \le y \le \beta(x)\}. \tag{3.6.27} \]

We say a region D in $\mathbb{R}^2$ is of Type II if there exist real numbers $c < d$ and continuous
functions $\gamma : \mathbb{R} \to \mathbb{R}$ and $\delta : \mathbb{R} \to \mathbb{R}$ such that $\gamma(y) \le \delta(y)$ for all y in $[c, d]$ and

\[ D = \{(x, y) : c \le y \le d,\ \gamma(y) \le x \le \delta(y)\}. \tag{3.6.28} \]

Figure 3.6.6 shows typical examples of regions of Type I and Type II. 
Example If D is the triangle with vertices at (0,0), (1,0), and (1,1), then

\[ D = \{(x,y) : 0 \le x \le 1,\ 0 \le y \le x\}. \]

Hence D is a Type I region with $\alpha(x) = 0$ and $\beta(x) = x$. Note that we also have

\[ D = \{(x,y) : 0 \le y \le 1,\ y \le x \le 1\}, \]

so D is also a Type II region with $\gamma(y) = y$ and $\delta(y) = 1$. See Figure 3.6.7.

Figure 3.6.7 Two regions which are of both Type I and Type II

Example The closed disk 

\[ D = \{(x,y) : x^2 + y^2 \le 1\} \]

is both a region of Type I, with

\[ D = \{(x,y) : -1 \le x \le 1,\ -\sqrt{1 - x^2} \le y \le \sqrt{1 - x^2}\}, \]

and a region of Type II, with

\[ D = \{(x,y) : -1 \le y \le 1,\ -\sqrt{1 - y^2} \le x \le \sqrt{1 - y^2}\}. \]

See Figure 3.6.7. 

Example Let D be the region which lies beneath the graph of $y = x^2$ and above the
interval $[-1, 1]$ on the x-axis. Then

\[ D = \{(x,y) : -1 \le x \le 1,\ 0 \le y \le x^2\}, \]

so D is a region of Type I. However, D is not a region of Type II. See Figure 3.6.8. 

Theorem If D is a region of Type I or a region of Type II and $f : \mathbb{R}^2 \to \mathbb{R}$ is continuous
on an open set containing D, then f is integrable on D.

Fubini's Theorem (for regions of Type I and Type II) Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is
integrable on the region D. If D is a region of Type I with

\[ D = \{(x, y) : a \le x \le b,\ \alpha(x) \le y \le \beta(x)\} \]

and the iterated integral

\[ \int_a^b \int_{\alpha(x)}^{\beta(x)} f(x,y)\,dy\,dx \]

exists, then

\[ \iint_D f(x,y)\,dx\,dy = \int_a^b \int_{\alpha(x)}^{\beta(x)} f(x,y)\,dy\,dx. \tag{3.6.29} \]

If D is a region of Type II with

\[ D = \{(x, y) : c \le y \le d,\ \gamma(y) \le x \le \delta(y)\} \]

and the iterated integral

\[ \int_c^d \int_{\gamma(y)}^{\delta(y)} f(x,y)\,dx\,dy \]

exists, then

\[ \iint_D f(x,y)\,dx\,dy = \int_c^d \int_{\gamma(y)}^{\delta(y)} f(x,y)\,dx\,dy. \tag{3.6.30} \]

Figure 3.6.8 A region which is of Type I but not of Type II



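Example Let D be the triangle with vertices at (0, 0), (1, 0), and (1, 1), as in the example
above. Expressing D as a region of Type I, we have

\[ \iint_D xy\,dx\,dy = \int_0^1 \int_0^x xy\,dy\,dx = \int_0^1 \left[ \frac{xy^2}{2} \right]_0^x dx = \int_0^1 \frac{x^3}{2}\,dx = \left[ \frac{x^4}{8} \right]_0^1 = \frac{1}{8}. \]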



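Since D is also a region of Type II, we may evaluate the integral in the other order as well,
obtaining

\[ \begin{aligned}
\iint_D xy\,dx\,dy &= \int_0^1 \int_y^1 xy\,dx\,dy \\
  &= \int_0^1 \left[ \frac{x^2 y}{2} \right]_y^1 dy \\
  &= \int_0^1 \left( \frac{y}{2} - \frac{y^3}{2} \right) dy \\
  &= \left[ \frac{y^2}{4} - \frac{y^4}{8} \right]_0^1 \\
  &= \frac{1}{8}.
\end{aligned} \]

Figure 3.6.9 The region $D = \{(x, y) : 0 \le x \le 1,\ \sqrt{x} \le y \le 1\}$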
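The two descriptions of the triangle with vertices (0, 0), (1, 0), and (1, 1) can be compared numerically. This is only an illustrative sketch: the function names `type_one` and `type_two` and the use of the midpoint rule are assumptions, not something taken from the text.

```python
def type_one(f, a, b, alpha, beta, n=400):
    """Midpoint approximation of int_a^b int_{alpha(x)}^{beta(x)} f(x, y) dy dx."""
    hx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        lo, hi = alpha(x), beta(x)          # inner limits depend on x
        hy = (hi - lo) / n
        inner = sum(f(x, lo + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total


def type_two(f, c, d, gamma, delta, n=400):
    """Midpoint approximation of int_c^d int_{gamma(y)}^{delta(y)} f(x, y) dx dy."""
    # swap the roles of the two variables and reuse type_one
    return type_one(lambda y, x: f(x, y), c, d, gamma, delta, n)


f = lambda x, y: x * y
as_type_one = type_one(f, 0.0, 1.0, lambda x: 0.0, lambda x: x)
as_type_two = type_two(f, 0.0, 1.0, lambda y: y, lambda y: 1.0)
print(as_type_one, as_type_two)  # both should be close to 1/8
```

Passing the boundary curves as functions mirrors the definitions of Type I and Type II regions: only the inner limits are allowed to depend on the outer variable.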



In the last example the choice of the order of integration was not too important, with the first order
being perhaps slightly easier than the second. However, there are times when the choice
of the order of integration has a significant effect on the ease of integration.

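Example Let $D = \{(x, y) : 0 \le x \le 1,\ \sqrt{x} \le y \le 1\}$ (see Figure 3.6.9). Since D is both
of Type I and of Type II, we may evaluate

\[ \iint_D e^{-y^3}\,dx\,dy \]

either as

\[ \int_0^1 \int_{\sqrt{x}}^1 e^{-y^3}\,dy\,dx \]

or as

\[ \int_0^1 \int_0^{y^2} e^{-y^3}\,dx\,dy. \]

The first of these two iterated integrals requires integrating $g(y) = e^{-y^3}$, which has no
elementary antiderivative; however, we may evaluate the second easily:

\[ \begin{aligned}
\iint_D e^{-y^3}\,dx\,dy &= \int_0^1 \int_0^{y^2} e^{-y^3}\,dx\,dy \\
  &= \int_0^1 \left[ x e^{-y^3} \right]_0^{y^2} dy \\
  &= \int_0^1 y^2 e^{-y^3}\,dy \\
  &= \left[ -\frac{1}{3} e^{-y^3} \right]_0^1 \\
  &= \frac{1}{3}\left( 1 - e^{-1} \right).
\end{aligned} \]

Figure 3.6.10 Region bounded by $z = 4 - x^2 - y^2$ and the xy-plane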



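Example Let V be the volume of the region lying below the paraboloid P with equation
$z = 4 - x^2 - y^2$ and above the xy-plane (see Figure 3.6.10). Since the surface P intersects
the xy-plane when

\[ 4 - x^2 - y^2 = 0, \]

that is, when

\[ x^2 + y^2 = 4, \]

V is the volume of the region bounded above by the graph of $f(x, y) = 4 - x^2 - y^2$ and
below by the region

\[ D = \{(x,y) : x^2 + y^2 \le 4\}. \]

If we describe D as a Type I region, namely,

\[ D = \{(x,y) : -2 \le x \le 2,\ -\sqrt{4 - x^2} \le y \le \sqrt{4 - x^2}\}, \]

then we may compute

\[ \begin{aligned}
V &= \iint_D (4 - x^2 - y^2)\,dx\,dy \\
  &= \int_{-2}^{2} \int_{-\sqrt{4 - x^2}}^{\sqrt{4 - x^2}} (4 - x^2 - y^2)\,dy\,dx \\
  &= \int_{-2}^{2} \left[ 4y - x^2 y - \frac{y^3}{3} \right]_{-\sqrt{4 - x^2}}^{\sqrt{4 - x^2}} dx \\
  &= \int_{-2}^{2} \left( 8\sqrt{4 - x^2} - 2x^2\sqrt{4 - x^2} - \frac{2}{3}(4 - x^2)^{3/2} \right) dx \\
  &= 2 \int_{-2}^{2} \left( (4 - x^2)\sqrt{4 - x^2} - \frac{1}{3}(4 - x^2)^{3/2} \right) dx \\
  &= \frac{4}{3} \int_{-2}^{2} (4 - x^2)^{3/2}\,dx.
\end{aligned} \]

Using the substitution $x = 2\sin(\theta)$, we have $dx = 2\cos(\theta)\,d\theta$, and so

\[ \begin{aligned}
V &= \frac{4}{3} \int_{-2}^{2} (4 - x^2)^{3/2}\,dx \\
  &= \frac{4}{3} \int_{-\pi/2}^{\pi/2} (4 - 4\sin^2(\theta))^{3/2}\, 2\cos(\theta)\,d\theta \\
  &= \frac{64}{3} \int_{-\pi/2}^{\pi/2} \cos^4(\theta)\,d\theta \\
  &= \frac{16}{3} \int_{-\pi/2}^{\pi/2} \left( 1 + 2\cos(2\theta) + \cos^2(2\theta) \right) d\theta \\
  &= \frac{16}{3} \left( \theta\Big|_{-\pi/2}^{\pi/2} + \sin(2\theta)\Big|_{-\pi/2}^{\pi/2} + \int_{-\pi/2}^{\pi/2} \frac{1 + \cos(4\theta)}{2}\,d\theta \right) \\
  &= \frac{16}{3} \left( \pi + \frac{\theta}{2}\Big|_{-\pi/2}^{\pi/2} + \frac{1}{8}\sin(4\theta)\Big|_{-\pi/2}^{\pi/2} \right) \\
  &= \frac{16}{3} \left( \pi + \frac{\pi}{2} \right) \\
  &= 8\pi.
\end{aligned} \]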



Integrals of functions of three or more variables 

We will now sketch how to extend the definition of the definite integral to higher dimensions.
Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is bounded on an n-dimensional closed rectangle

\[ D = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]. \]

Let $P_1, P_2, \ldots, P_n$ partition the intervals $[a_1, b_1], [a_2, b_2], \ldots, [a_n, b_n]$ into $m_1, m_2, \ldots, m_n$
intervals, respectively, and let $P_1 \times P_2 \times \cdots \times P_n$ represent the corresponding partition
of D into $m_1 m_2 \cdots m_n$ n-dimensional closed rectangles $D_{i_1 i_2 \cdots i_n}$. If $m_{i_1 i_2 \cdots i_n}$ is the largest
real number such that $m_{i_1 i_2 \cdots i_n} \le f(\mathbf{x})$ for all $\mathbf{x}$ in $D_{i_1 i_2 \cdots i_n}$ and $M_{i_1 i_2 \cdots i_n}$ is the smallest
real number such that $f(\mathbf{x}) \le M_{i_1 i_2 \cdots i_n}$ for all $\mathbf{x}$ in $D_{i_1 i_2 \cdots i_n}$, then we may define the lower
sum

\[ L(f, P_1 \times P_2 \times \cdots \times P_n) = \sum_{i_1=1}^{m_1} \sum_{i_2=1}^{m_2} \cdots \sum_{i_n=1}^{m_n} m_{i_1 i_2 \cdots i_n}\, \Delta x_{1 i_1} \Delta x_{2 i_2} \cdots \Delta x_{n i_n} \tag{3.6.31} \]

and the upper sum

\[ U(f, P_1 \times P_2 \times \cdots \times P_n) = \sum_{i_1=1}^{m_1} \sum_{i_2=1}^{m_2} \cdots \sum_{i_n=1}^{m_n} M_{i_1 i_2 \cdots i_n}\, \Delta x_{1 i_1} \Delta x_{2 i_2} \cdots \Delta x_{n i_n}, \tag{3.6.32} \]

where $\Delta x_{jk}$ is the length of the kth interval of the partition $P_j$. We then say f is integrable
on D if there exists a unique real number I with the property that

\[ L(f, P_1 \times P_2 \times \cdots \times P_n) \le I \le U(f, P_1 \times P_2 \times \cdots \times P_n) \tag{3.6.33} \]

for all choices of partitions $P_1, P_2, \ldots, P_n$, and we write

\[ I = \int \cdots \iint_D f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n, \tag{3.6.34} \]

or

\[ I = \int \cdots \iint_D f(\mathbf{x})\,d\mathbf{x}, \tag{3.6.35} \]

for the definite integral of f on D.

We may now generalize the definition of the integral to more general regions in the 
same manner as above. Moreover, our integrability theorem and Fubini's theorem, with 
appropriate changes, hold as well. When n = 3, we may interpret 

\[ \iiint_D f(x,y,z)\,dx\,dy\,dz \tag{3.6.36} \]

to be the total mass of D if $f(x,y,z)$ represents the density of mass at $(x,y,z)$, or the
total electric charge of D if $f(x, y, z)$ represents the electric charge density at $(x, y, z)$. For
any value of n we may interpret

\[ \int \cdots \iint_D dx_1\,dx_2 \cdots dx_n \tag{3.6.37} \]

to be the n-dimensional volume of D. We will not go into further details, preferring to 
illustrate with examples. 

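Example Suppose D is the closed rectangle

\[ D = \{(x,y,z,t) : 0 \le x \le 1,\ -1 \le y \le 1,\ -2 \le z \le 2,\ 0 \le t \le 2\} = [0,1] \times [-1,1] \times [-2,2] \times [0,2]. \]

Then

\[ \begin{aligned}
\iiiint_D (x^2 + y^2 + z^2 - t^2)\,dx\,dy\,dz\,dt
  &= \int_0^1 \int_{-1}^{1} \int_{-2}^{2} \int_0^2 (x^2 + y^2 + z^2 - t^2)\,dt\,dz\,dy\,dx \\
  &= \int_0^1 \int_{-1}^{1} \int_{-2}^{2} \left[ x^2 t + y^2 t + z^2 t - \frac{t^3}{3} \right]_0^2 dz\,dy\,dx \\
  &= \int_0^1 \int_{-1}^{1} \int_{-2}^{2} \left( 2x^2 + 2y^2 + 2z^2 - \frac{8}{3} \right) dz\,dy\,dx \\
  &= \int_0^1 \int_{-1}^{1} \left[ 2x^2 z + 2y^2 z + \frac{2z^3}{3} - \frac{8z}{3} \right]_{-2}^{2} dy\,dx \\
  &= \int_0^1 \int_{-1}^{1} \left( 8x^2 + 8y^2 + \frac{32}{3} - \frac{32}{3} \right) dy\,dx \\
  &= \int_0^1 \left[ 8x^2 y + \frac{8y^3}{3} \right]_{-1}^{1} dx \\
  &= \int_0^1 \left( 16x^2 + \frac{16}{3} \right) dx \\
  &= \left[ \frac{16x^3}{3} + \frac{16x}{3} \right]_0^1 \\
  &= \frac{32}{3}.
\end{aligned} \]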
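The value 32/3 can be confirmed with a short numerical sketch. The approach here is an assumption chosen for convenience rather than anything from the text: a tensor product of two-point Gauss-Legendre rules, which integrates polynomials of degree at most 3 in each variable exactly, so for this quadratic integrand the result should match 32/3 up to rounding. The names `gauss2_nodes` and `box_integral` are hypothetical.

```python
from itertools import product
from math import sqrt


def gauss2_nodes(a, b):
    """Nodes and weights of the two-point Gauss-Legendre rule on [a, b]."""
    mid, half = (a + b) / 2, (b - a) / 2
    return [(mid - half / sqrt(3), half), (mid + half / sqrt(3), half)]


def box_integral(f, bounds):
    """Tensor-product two-point Gauss rule over a box given as [(a1, b1), (a2, b2), ...]."""
    total = 0.0
    for combo in product(*(gauss2_nodes(a, b) for a, b in bounds)):
        point = [node for node, _ in combo]
        weight = 1.0
        for _, w in combo:
            weight *= w          # product of the one-dimensional weights
        total += weight * f(*point)
    return total


f = lambda x, y, z, t: x**2 + y**2 + z**2 - t**2
value = box_integral(f, [(0, 1), (-1, 1), (-2, 2), (0, 2)])
print(value, 32 / 3)
```

Only 16 function evaluations are needed here because the rectangle is four-dimensional and each axis contributes two nodes.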



Example Let D be the region in $\mathbb{R}^3$ bounded by the three coordinate planes and
the plane P with equation $z = 1 - x - y$ (see Figure 3.6.11). Suppose we wish to evaluate

\[ \iiint_D xyz\,dx\,dy\,dz. \]

Note that the side of D which lies in the xy-plane, that is, the plane $z = 0$, is a triangle
with vertices at (0, 0, 0), (1, 0, 0), and (0, 1, 0). Or, strictly in terms of x and y coordinates,
we may describe this face as the triangle in the first quadrant bounded by the line $y = 1 - x$
(see Figure 3.6.11). Hence x varies from 0 to 1, and, for each value of x, y varies from 0
to $1 - x$. Finally, once we have fixed values for x and y, z varies from 0 up to P, that is,
to $1 - x - y$. Hence we have

\[ \begin{aligned}
\iiint_D xyz\,dx\,dy\,dz &= \int_0^1 \int_0^{1-x} \int_0^{1-x-y} xyz\,dz\,dy\,dx \\
  &= \int_0^1 \int_0^{1-x} \left[ \frac{xyz^2}{2} \right]_0^{1-x-y} dy\,dx \\
  &= \int_0^1 \int_0^{1-x} \frac{xy(1 - x - y)^2}{2}\,dy\,dx \\
  &= \int_0^1 \frac{x(1 - x)^4}{24}\,dx \\
  &= \frac{1}{720}.
\end{aligned} \]

Example Let V be the volume of the region D in $\mathbb{R}^3$ bounded by the paraboloids with
equations $z = 10 - x^2 - y^2$ and $z = x^2 + y^2 - 8$ (see Figure 3.6.12). We will find V by
evaluating

\[ V = \iiint_D dx\,dy\,dz. \]

To set up an iterated integral, we first note that the paraboloid $z = 10 - x^2 - y^2$ opens
downward about the z-axis and the paraboloid $z = x^2 + y^2 - 8$ opens upward about the
z-axis. The two paraboloids intersect when

\[ 10 - x^2 - y^2 = x^2 + y^2 - 8, \]

that is, when

\[ x^2 + y^2 = 9. \]

Figure 3.6.12 Region bounded by $z = 10 - x^2 - y^2$ and $z = x^2 + y^2 - 8$

Now we may describe the region in the xy-plane described by $x^2 + y^2 \le 9$ as the set of
points $(x,y)$ for which $-3 \le x \le 3$ and, for every such fixed x,

\[ -\sqrt{9 - x^2} \le y \le \sqrt{9 - x^2}. \]

Moreover, once we have fixed x and y so that $(x, y)$ is inside the circle $x^2 + y^2 = 9$, then
$(x, y, z)$ is in D provided $x^2 + y^2 - 8 \le z \le 10 - x^2 - y^2$. Hence we have

\[ \begin{aligned}
V &= \iiint_D dx\,dy\,dz \\
  &= \int_{-3}^{3} \int_{-\sqrt{9 - x^2}}^{\sqrt{9 - x^2}} \int_{x^2 + y^2 - 8}^{10 - x^2 - y^2} dz\,dy\,dx \\
  &= \int_{-3}^{3} \int_{-\sqrt{9 - x^2}}^{\sqrt{9 - x^2}} z\Big|_{x^2 + y^2 - 8}^{10 - x^2 - y^2}\,dy\,dx \\
  &= \int_{-3}^{3} \int_{-\sqrt{9 - x^2}}^{\sqrt{9 - x^2}} (18 - 2x^2 - 2y^2)\,dy\,dx \\
  &= \int_{-3}^{3} \left[ 18y - 2x^2 y - \frac{2y^3}{3} \right]_{-\sqrt{9 - x^2}}^{\sqrt{9 - x^2}} dx \\
  &= \int_{-3}^{3} \left( 36\sqrt{9 - x^2} - 4x^2\sqrt{9 - x^2} - \frac{4}{3}(9 - x^2)^{3/2} \right) dx \\
  &= \int_{-3}^{3} \sqrt{9 - x^2} \left( 36 - 4x^2 - \frac{4}{3}(9 - x^2) \right) dx \\
  &= \frac{8}{3} \int_{-3}^{3} (9 - x^2)^{3/2}\,dx.
\end{aligned} \]

Using the substitution $x = 3\sin(\theta)$, we have $dx = 3\cos(\theta)\,d\theta$, and so

\[ \begin{aligned}
V &= \frac{8}{3} \int_{-3}^{3} (9 - x^2)^{3/2}\,dx \\
  &= \frac{8}{3} \int_{-\pi/2}^{\pi/2} (9 - 9\sin^2(\theta))^{3/2} (3\cos(\theta))\,d\theta \\
  &= 216 \int_{-\pi/2}^{\pi/2} \cos^4(\theta)\,d\theta \\
  &= 54 \int_{-\pi/2}^{\pi/2} \left( 1 + 2\cos(2\theta) + \cos^2(2\theta) \right) d\theta \\
  &= 54 \left( \theta\Big|_{-\pi/2}^{\pi/2} + \sin(2\theta)\Big|_{-\pi/2}^{\pi/2} + \int_{-\pi/2}^{\pi/2} \frac{1 + \cos(4\theta)}{2}\,d\theta \right) \\
  &= 54\pi + 27\theta\Big|_{-\pi/2}^{\pi/2} + \frac{27}{4}\sin(4\theta)\Big|_{-\pi/2}^{\pi/2} \\
  &= 81\pi.
\end{aligned} \]

Problems

1. Evaluate each of the following iterated integrals.

r>3 r 2 



(a) / / 3xy 2 dydx (b) / / 4xsin(x + y)dydx 

Ji Jo Jo Jo 

/2 n 1 /»2 /» 1 

/ (4- x 2 y 2 )dxdy (d) / / e x+y dxdy 

-2 7-i 7o 7o 

2. Evaluate the following definite integrals over the given rectangles. 

(a) J j D (y 2 - ^y)dxdy, D = [0, 2] x [0, 1] (b) J * dxdy, D = [1, 2] x [1, 3] 

(c) J J ye~ x dxdy, D = [0, 1] x [0, 2] (d) J J dxdy, D = [1, 2] x [0, 1] 

3. For each of the following, evaluate the iterated integrals and sketch the region of 
integration. 

r2 ry r\ rx^ 

(a) / / (xy 2 - x 2 )dxdy (b) / / (x 2 + y 2 )dydx 

Jo Jo Jo Jx 4 

r2 r>^/4 — x 2 rl r>y 2 

(c) / / (4 - x 2 - y 2 )dydx (d) / / xye~ x ~ y dxdy 

Jo Jo Jo Jo 

4. Find the volume of the region beneath the graph of $f(x, y) = 2 + x^2 + y^2$ and above
the rectangle $D = [-1, 1] \times [-2, 2]$.

5. Find the volume of the region beneath the graph of $f(x, y) = 4 - x^2 + y^2$ and above
the region $D = \{(x, y) : 0 \le x \le 2,\ -x \le y \le x\}$. Sketch the region D.

6. Evaluate $\iint_D xy\,dx\,dy$, where D is the region bounded by the x-axis, the y-axis, and
the line $y = 2 - x$.

7. Evaluate $\iint_D e^{-x^2}\,dx\,dy$, where $D = \{(x, y) : 0 \le y \le 1,\ y \le x \le 1\}$.

8. Find the volume of the region in $\mathbb{R}^3$ described by $x \ge 0$, $y \ge 0$, and $0 \le z \le 4 - 2y - 4x$.

9. Find the volume of the region in $\mathbb{R}^3$ lying above the xy-plane and below the surface
with equation $z = 16 - x^2 - y^2$.

10. Find the volume of the region in $\mathbb{R}^3$ lying above the xy-plane and below the surface
with equation $z = 4 - 2x^2 - y^2$.

11. Evaluate each of the following iterated integrals.

(a) $\int_1^2 \int_0^3 \int_{-2}^{2} (4 - x^2 - z^2)\,dy\,dx\,dz$   (b) $\int_{-2}^{3} \int_{-1}^{2} \int_0^2 3xyz\,dx\,dy\,dz$

(c) $\int_0^4 \int_0^x \int_0^{x+y} (x^2 - yz)\,dz\,dy\,dx$   (d) $\int_0^1 \int_0^x \int_0^{x+y} \int_0^{x+y+z} w\,dw\,dz\,dy\,dx$

12. Find the volume of the region in $\mathbb{R}^3$ bounded by the paraboloids with equations
$z = 3 - x^2 - y^2$ and $z = x^2 + y^2 - 5$.

13. Evaluate $\iiint_D xy\,dx\,dy\,dz$, where D is the region bounded by the xy-plane, the yz-plane,
the xz-plane, and the plane with equation $z = 4 - x - y$.

14. If $f(x, y, z)$ represents the density of mass at the point $(x, y, z)$ of an object occupying
a region D in $\mathbb{R}^3$, then

\[ m = \iiint_D f(x,y,z)\,dx\,dy\,dz \]

is the total mass of the object and the point $(\bar{x}, \bar{y}, \bar{z})$, where

\[ \bar{x} = \frac{1}{m} \iiint_D x f(x,y,z)\,dx\,dy\,dz, \]

\[ \bar{y} = \frac{1}{m} \iiint_D y f(x,y,z)\,dx\,dy\,dz, \]

and

\[ \bar{z} = \frac{1}{m} \iiint_D z f(x,y,z)\,dx\,dy\,dz, \]

is called the center of mass of the object. Suppose D is the region bounded by the
planes $x = 0$, $y = 0$, $z = 0$, and $z = 4 - x - 2y$.

(a) Find the total mass and center of mass for an object occupying the region D with
mass density given by $f(x,y,z) = 1$.

(b) Find the total mass and center of mass for an object occupying the region D with
mass density given by $f(x,y,z) = z$.

15. If X and Y are points chosen at random from the interval [0, 1], then the probability
that $(X, Y)$ lies in a subset D of the unit square $[0, 1] \times [0, 1]$ is $\iint_D dx\,dy$.

(a) Find the probability that $X \le Y$.

(b) Find the probability that $X + Y \le 1$.

(c) Find the probability that XY >

16. If X, Y, and Z are points chosen at random from the interval [0, 1], then the probability
that $(X, Y, Z)$ lies in a subset D of the unit cube $[0, 1] \times [0, 1] \times [0, 1]$ is $\iiint_D dx\,dy\,dz$.

(a) Find the probability that $X \le Y \le Z$.

(b) Find the probability that $X + Y + Z \le 1$.

Section 3.7 Change of Variables in Integrals

One of the basic techniques for evaluating an integral in one-variable calculus is substitution,
replacing one variable with another in such a way that the resulting integral is of a
simpler form. Although slightly more subtle in the case of two or more variables, a similar
idea provides a powerful technique for evaluating definite integrals.

Linear change of variables 

We will present the main idea through an example. Let

\[ D = \{(x,y) : 9x^2 + 4y^2 \le 36\}, \]

the region inside the ellipse which intersects the x-axis at $(-2, 0)$ and $(2, 0)$ and the y-axis
at $(0, -3)$ and $(0, 3)$. To find the area of D, we evaluate

\[ \iint_D dx\,dy = \int_{-2}^{2} \int_{-\frac{3}{2}\sqrt{4 - x^2}}^{\frac{3}{2}\sqrt{4 - x^2}} dy\,dx = 3 \int_{-2}^{2} \sqrt{4 - x^2}\,dx = 6\pi, \]

where the final integral may be evaluated using the substitution $x = 2\sin(\theta)$ or by noting
that

\[ \int_{-2}^{2} \sqrt{4 - x^2}\,dx \]

is one-half of the area of a circle of radius 2. Alternatively, suppose we write the equation
of the ellipse as

\[ \frac{x^2}{4} + \frac{y^2}{9} = 1 \]

and make the substitution $x = 2u$ and $y = 3v$. Then $u = \frac{x}{2}$ and $v = \frac{y}{3}$, so if $(x, y)$ is a
point in D, then

\[ u^2 + v^2 = \frac{x^2}{4} + \frac{y^2}{9} \le 1. \]

That is, if $(x, y)$ is a point in D, then $(u, v)$ is a point in the unit disk

\[ E = \{(u,v) : u^2 + v^2 \le 1\}. \]

Conversely, if $(u, v)$ is a point in E, then

\[ \frac{x^2}{4} + \frac{y^2}{9} = \frac{4u^2}{4} + \frac{9v^2}{9} = u^2 + v^2 \le 1, \]

so $(x,y)$ is a point in D. Thus the function $F(u,v) = (2u, 3v)$ takes the region E, a
closed disk of radius 1, and stretches it onto the region D (as shown in Figure 3.7.1).

Figure 3.7.1 F maps E onto D

However, note that even though every point in E corresponds to exactly one point in D,
and, conversely, every point in D corresponds to exactly one point in E, nevertheless E and
D do not have the same area. To see how F changes area, consider what it does to the
unit square S with sides $\mathbf{e}_1 = (1,0)$ and $\mathbf{e}_2 = (0, 1)$. The area of S is 1, but F maps S
onto a rectangle R with sides

\[ F(1,0) = (2,0) \]

and

\[ F(0,1) = (0,3) \]

and area 6. This is a special case of a general fact we saw in Section 1.6: the linear function
F, with associated matrix

\[ M = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}, \]

maps the unit square S onto a parallelogram R with area

\[ |\det(M)| = 6. \]

The important fact for us here is that 1 unit of area in the uv-plane corresponds to 6 units
of area in the xy-plane. Hence the area of D will be 6 times the area of E. That is,

\[ \iint_D dx\,dy = \iint_E |\det(M)|\,du\,dv = \iint_E 6\,du\,dv = 6 \iint_E du\,dv = 6\pi, \]

where the final integral is simply the area inside a circle of radius 1.

These ideas provide the background for a proof of the following theorem.

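Theorem Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is continuous on an open set U containing the closed
bounded set D. Suppose $F : \mathbb{R}^n \to \mathbb{R}^n$ is a linear function, M is an $n \times n$ matrix such that
$F(\mathbf{u}) = M\mathbf{u}$, and $\det(M) \ne 0$. If F maps the region E onto the region D and we define
the change of variables

\[ \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = M \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}, \]

then

\[ \int \cdots \iint_D f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n = \int \cdots \iint_E f(F(u_1, u_2, \ldots, u_n))\,|\det(M)|\,du_1\,du_2 \cdots du_n. \tag{3.7.1} \]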



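Example Let D be the region in $\mathbb{R}^3$ bounded by the ellipsoid with equation

\[ \frac{x^2}{4} + \frac{y^2}{16} + \frac{z^2}{9} = 1. \]

See Figure 3.7.2. If we make the change of variables $x = 2u$, $y = 4v$, and $z = 3w$, that is,

\[ \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix}, \]

then, for any $(x, y, z)$ in D, we have

\[ u^2 + v^2 + w^2 = \frac{x^2}{4} + \frac{y^2}{16} + \frac{z^2}{9} \le 1. \]

That is, if $(x,y,z)$ lies in D, then the corresponding $(u,v,w)$ lies in the closed unit ball
$E = \bar{B}^3((0, 0, 0), 1)$. Conversely, if $(u, v, w)$ lies in E, then

\[ \frac{x^2}{4} + \frac{y^2}{16} + \frac{z^2}{9} = \frac{4u^2}{4} + \frac{16v^2}{16} + \frac{9w^2}{9} = u^2 + v^2 + w^2 \le 1, \]

so $(x, y, z)$ lies in D. Hence, the change of variables $F(u, v, w) = (2u, 4v, 3w)$ maps E onto
D. Now

\[ \det \begin{bmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 3 \end{bmatrix} = 24, \]

so if V is the volume of D, then

\[ V = \iiint_D dx\,dy\,dz = \iiint_E 24\,du\,dv\,dw = 24 \iiint_E du\,dv\,dw = 24 \left( \frac{4\pi}{3} \right) = 32\pi, \]

where we have used the fact that the volume of a sphere of radius 1 is $\frac{4\pi}{3}$ to evaluate the
final integral.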
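The area-scaling fact behind the theorem can be checked numerically for the earlier ellipse example, where $F(u, v) = (2u, 3v)$. The sketch below is illustrative only: the helper `grid_area` and the grid size are assumptions, and counting grid midpoints gives an approximation to the areas, not exact values.

```python
import math


def grid_area(inside, x_range, y_range, n=400):
    """Estimate the area of {(x, y) : inside(x, y)} by counting midpoints of an n-by-n grid."""
    (x0, x1), (y0, y1) = x_range, y_range
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    count = 0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            if inside(x, y):
                count += 1
    return count * hx * hy


det_M = 2 * 3 - 0 * 0  # determinant of the matrix [[2, 0], [0, 3]]
area_E = grid_area(lambda u, v: u**2 + v**2 <= 1, (-1, 1), (-1, 1))
area_D = grid_area(lambda x, y: 9 * x**2 + 4 * y**2 <= 36, (-2, 2), (-3, 3))
print(area_E, area_D, det_M * area_E, 6 * math.pi)
```

The grid used for D is the image under F of the grid used for E, so the two counts should agree except possibly for points that fall on the boundary up to rounding, and the estimate for D is essentially `det_M` times the estimate for E.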



Nonlinear change of variables 

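Without going into the technical details, we will indicate how to proceed when the change
of variables is not linear. Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is continuous on an open set U containing
the closed bounded set D and $F : \mathbb{R}^n \to \mathbb{R}^n$ maps a closed bounded region E of $\mathbb{R}^n$
onto D so that every point of D corresponds to exactly one point of E. Writing
$F(\mathbf{u}) = (F_1(\mathbf{u}), F_2(\mathbf{u}), \ldots, F_n(\mathbf{u}))$, we will assume that $F_1, F_2, \ldots$, and $F_n$ are all differentiable
on an open set W containing E. Although we will not study this type of function until
Chapter 4, the natural candidate for the derivative of F is the matrix whose ith row is
$\nabla F_i(\mathbf{u})$. Letting $x_i = F_i(u_1, u_2, \ldots, u_n)$, $i = 1, 2, \ldots, n$, we denote this matrix, called the
Jacobian matrix of F, by

\[ \frac{\partial(x_1, x_2, \ldots, x_n)}{\partial(u_1, u_2, \ldots, u_n)}. \tag{3.7.2} \]

Explicitly,

\[ \frac{\partial(x_1, x_2, \ldots, x_n)}{\partial(u_1, u_2, \ldots, u_n)} =
\begin{bmatrix}
\frac{\partial}{\partial u_1} F_1(\mathbf{u}) & \frac{\partial}{\partial u_2} F_1(\mathbf{u}) & \cdots & \frac{\partial}{\partial u_n} F_1(\mathbf{u}) \\
\frac{\partial}{\partial u_1} F_2(\mathbf{u}) & \frac{\partial}{\partial u_2} F_2(\mathbf{u}) & \cdots & \frac{\partial}{\partial u_n} F_2(\mathbf{u}) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial}{\partial u_1} F_n(\mathbf{u}) & \frac{\partial}{\partial u_2} F_n(\mathbf{u}) & \cdots & \frac{\partial}{\partial u_n} F_n(\mathbf{u})
\end{bmatrix}. \tag{3.7.3} \]

We shall see in Chapter 4 that

\[ \frac{\partial(x_1, x_2, \ldots, x_n)}{\partial(u_1, u_2, \ldots, u_n)} \]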

is the matrix for the linear part of the best affine approximation to F at $(u_1, u_2, \ldots, u_n)$.
Hence, for sufficiently small rectangles, the factor by which F changes the area of a rectangle
when it maps it to a region will be approximately

\[ \left| \det \frac{\partial(x_1, x_2, \ldots, x_n)}{\partial(u_1, u_2, \ldots, u_n)} \right|. \tag{3.7.4} \]

One may then show that, analogous to (3.7.1), we have

\[ \int \cdots \iint_D f(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n = \int \cdots \iint_E f(F(u_1, u_2, \ldots, u_n)) \left| \det \frac{\partial(x_1, x_2, \ldots, x_n)}{\partial(u_1, u_2, \ldots, u_n)} \right| du_1\,du_2 \cdots du_n. \tag{3.7.5} \]

Note that (3.7.5) is just (3.7.1) with the matrix M replaced by the Jacobian of F.

We will now look at two very useful special cases of the preceding result. See Problems 
22 and 23 for a third special case. 

Polar coordinates 

As an alternative to describing the location of a point P in the plane using its Cartesian
coordinates $(x, y)$, we may locate the point using r, the distance from P to the origin, and
$\theta$, the angle between the vector from (0, 0) to P and the positive x-axis, measured in the
counterclockwise direction from 0 to $2\pi$ (see Figure 3.7.3). That is, if P has Cartesian
coordinates $(x, y)$, with $x \ne 0$, we may define its polar coordinates $(r, \theta)$ by specifying that

\[ r = \sqrt{x^2 + y^2} \tag{3.7.6} \]

and

\[ \tan(\theta) = \frac{y}{x}, \tag{3.7.7} \]

where we take $0 \le \theta \le \pi$ if $y \ge 0$ and $\pi < \theta < 2\pi$ if $y < 0$. If $x = 0$, we let $\theta = \frac{\pi}{2}$ if
$y > 0$ and $\theta = \frac{3\pi}{2}$ if $y < 0$. For $(x, y) = (0, 0)$, $r = 0$ and $\theta$ could have any value, and so is
undefined. Conversely, if a point P has polar coordinates $(r, \theta)$, then

\[ x = r\cos(\theta) \tag{3.7.8} \]

and

\[ y = r\sin(\theta). \tag{3.7.9} \]

Note that the choice of the interval $[0, 2\pi)$ for the values of $\theta$ is not unique, with any
interval of length $2\pi$ working as well. Although $[0, 2\pi)$ is the most common choice for
values of $\theta$, it is sometimes useful to use $(-\pi, \pi]$ instead.
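The conversion rules above can be sketched in code. This is an illustrative sketch, not part of the text: the function names `to_polar` and `to_cartesian` are hypothetical, and the standard-library function `math.atan2` is used in place of the case analysis on the signs of x and y, with its result in $(-\pi, \pi]$ shifted to $[0, 2\pi)$.

```python
import math


def to_polar(x, y):
    """Return (r, theta) with theta in [0, 2*pi); theta is undefined at the origin."""
    if x == 0 and y == 0:
        raise ValueError("theta is undefined at the origin")
    r = math.hypot(x, y)                       # r = sqrt(x^2 + y^2)
    theta = math.atan2(y, x) % (2 * math.pi)   # shift (-pi, pi] to [0, 2*pi)
    return r, theta


def to_cartesian(r, theta):
    """Return (x, y) = (r cos(theta), r sin(theta))."""
    return r * math.cos(theta), r * math.sin(theta)


r, theta = to_polar(-1.0, 1.0)
print(r, theta)                # approximately sqrt(2) and 3*pi/4
print(to_cartesian(r, theta))  # approximately (-1.0, 1.0)
```

Using `atan2` avoids dividing by x, so the cases $x = 0$, $y > 0$ and $x = 0$, $y < 0$ need no special handling.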

Example If a point P has Cartesian coordinates $(-1, 1)$, then its polar coordinates are $\left( \sqrt{2}, \frac{3\pi}{4} \right)$.

Example A point with polar coordinates (3, ^) has Cartesian coordinates (j^, |j- 
In our current context, we want to think of the polar coordinate mapping

\[ (x, y) = F(r, \theta) = (r\cos(\theta), r\sin(\theta)) \tag{3.7.10} \]

as a change of variables between the $r\theta$-plane and the xy-plane. This mapping is particularly
useful for us because it maps rectangular regions in the $r\theta$-plane onto circular regions
in the xy-plane. For example, for any $a > 0$, F maps the rectangular region

\[ E = \{(r, \theta) : 0 \le r \le a,\ 0 \le \theta < 2\pi\} \]

in the $r\theta$-plane onto the closed disk

\[ D = \bar{B}^2((0, 0), a) = \{(x, y) : x^2 + y^2 \le a^2\} \]

in the xy-plane (see Figure 3.7.5 below for an example). More generally, for any $0 \le \alpha < \beta < 2\pi$,
F maps the rectangular region

\[ E = \{(r, \theta) : 0 \le r \le a,\ \alpha \le \theta \le \beta\} \]

in the $r\theta$-plane onto a region D in the xy-plane which is the sector of the closed disk
$\bar{B}^2((0,0), a)$ which lies between radii of angles $\alpha$ and $\beta$ (see Figure 3.7.4). Another basic
example is an annulus: for any $0 < a < b$, F maps the rectangular region

\[ E = \{(r, \theta) : a \le r \le b,\ 0 \le \theta < 2\pi\} \]

in the $r\theta$-plane onto the annulus

\[ D = \{(x,y) : a^2 \le x^2 + y^2 \le b^2\} \]

in the xy-plane. Figure 3.7.6 illustrates this mapping for the upper half of an annulus.




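Example Let V be the volume of the region which lies beneath the paraboloid with
equation $z = 4 - x^2 - y^2$ and above the xy-plane. In Section 3.6, we saw that

\[ V = \iint_D (4 - x^2 - y^2)\,dx\,dy = 8\pi, \]

where

\[ D = \{(x,y) : x^2 + y^2 \le 4\}. \]

The use of polar coordinates greatly simplifies the evaluation of this integral. With the
polar coordinate change of variables

\[ x = r\cos(\theta) \]

and

\[ y = r\sin(\theta), \]

the closed disk D in the xy-plane corresponds to the closed rectangle

\[ E = \{(r, \theta) : 0 \le r \le 2,\ 0 \le \theta \le 2\pi\} \]

in the $r\theta$-plane (see Figure 3.7.5). Note that in describing E we have allowed $\theta = 2\pi$,
but this has no effect on our outcome since a line has no area in $\mathbb{R}^2$. Moreover, if we let
$f(x, y) = 4 - x^2 - y^2$, then

\[ \begin{aligned}
f(F(r, \theta)) &= f(r\cos(\theta), r\sin(\theta)) \\
  &= 4 - r^2\cos^2(\theta) - r^2\sin^2(\theta) \\
  &= 4 - r^2(\cos^2(\theta) + \sin^2(\theta)) \\
  &= 4 - r^2,
\end{aligned} \]

which also follows from the fact that $r^2 = x^2 + y^2$.

Figure 3.7.5 Polar coordinate change of variables maps $[0,2] \times [0, 2\pi]$ to $\bar{B}^2((0,0),2)$

Now

\[ \frac{\partial(x,y)}{\partial(r,\theta)} =
\begin{bmatrix}
\frac{\partial}{\partial r} r\cos(\theta) & \frac{\partial}{\partial \theta} r\cos(\theta) \\
\frac{\partial}{\partial r} r\sin(\theta) & \frac{\partial}{\partial \theta} r\sin(\theta)
\end{bmatrix} =
\begin{bmatrix}
\cos(\theta) & -r\sin(\theta) \\
\sin(\theta) & r\cos(\theta)
\end{bmatrix}, \tag{3.7.11} \]

so

\[ \det \frac{\partial(x,y)}{\partial(r,\theta)} = r\cos^2(\theta) + r\sin^2(\theta) = r(\cos^2(\theta) + \sin^2(\theta)) = r. \tag{3.7.12} \]

Hence, using (3.7.5), we have

\[ \begin{aligned}
\iint_D (4 - x^2 - y^2)\,dx\,dy &= \iint_E (4 - r^2) \left| \det \frac{\partial(x,y)}{\partial(r,\theta)} \right| dr\,d\theta \\
  &= \int_0^2 \int_0^{2\pi} (4 - r^2)\,r\,d\theta\,dr \\
  &= \int_0^2 2\pi(4r - r^3)\,dr \\
  &= 2\pi \left[ 2r^2 - \frac{r^4}{4} \right]_0^2 \\
  &= 2\pi(8 - 4) \\
  &= 8\pi.
\end{aligned} \]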
Example Suppose D is the part of the region between the circles with equations $x^2 + y^2 = 1$
and $x^2 + y^2 = 9$ which lies above the x-axis. That is,

\[ D = \{(x, y) : 1 \le x^2 + y^2 \le 9,\ y \ge 0\}. \]

Figure 3.7.6 Polar coordinates map $[1,3] \times [0, \pi]$ to top half of an annulus

We wish to evaluate

\[ \iint_D e^{-(x^2 + y^2)}\,dx\,dy. \]

Under the polar coordinate change of variables

\[ x = r\cos(\theta) \]

and

\[ y = r\sin(\theta), \]

the annular region D corresponds to the closed rectangle

\[ E = \{(r, \theta) : 1 \le r \le 3,\ 0 \le \theta \le \pi\}, \]

as illustrated in Figure 3.7.6. Moreover, $x^2 + y^2 = r^2$ and, as we saw in the previous
example,

\[ \left| \det \frac{\partial(x,y)}{\partial(r,\theta)} \right| = r. \]

Hence

\[ \begin{aligned}
\iint_D e^{-(x^2 + y^2)}\,dx\,dy &= \iint_E r e^{-r^2}\,dr\,d\theta \\
  &= \int_1^3 \int_0^{\pi} r e^{-r^2}\,d\theta\,dr \\
  &= \int_1^3 \pi r e^{-r^2}\,dr \\
  &= -\frac{\pi}{2} e^{-r^2} \Big|_1^3 \\
  &= \frac{\pi}{2} \left( e^{-1} - e^{-9} \right).
\end{aligned} \]

Note that in this case the change of variables not only simplified the region of integration,
but also put the function being integrated into a form to which we could apply the
Fundamental Theorem of Calculus.




Spherical coordinates 

Next consider the following extension of polar coordinates to three space: given a point
P with Cartesian coordinates $(x,y,z)$, let $\rho$ be the distance from P to the origin, $\theta$ be
the angle coordinate for the polar coordinates of $(x, y, 0)$ (the projection of P onto the
xy-plane), and let $\varphi$ be the angle between the vector from the origin to P and the positive
z-axis, measured from 0 to $\pi$. If $x \ne 0$, we have

\[ \rho = \sqrt{x^2 + y^2 + z^2}, \tag{3.7.13} \]

\[ \tan(\theta) = \frac{y}{x}, \tag{3.7.14} \]

and

\[ \cos(\varphi) = \frac{z}{\sqrt{x^2 + y^2 + z^2}}, \tag{3.7.15} \]

where $0 \le \theta < 2\pi$ and $0 \le \varphi \le \pi$. As with polar coordinates, if $x = 0$ we let $\theta = \frac{\pi}{2}$ if
$y > 0$, $\theta = \frac{3\pi}{2}$ if $y < 0$, and $\theta$ is undefined if $y = 0$. See Figure 3.7.7. Conversely, given
a point P with spherical coordinates $(\rho, \theta, \varphi)$, the projection of P onto the xy-plane will
have polar coordinate $r = \rho\sin(\varphi)$. Hence the Cartesian coordinates of P are

\[ x = \rho\cos(\theta)\sin(\varphi), \tag{3.7.16} \]

\[ y = \rho\sin(\theta)\sin(\varphi), \tag{3.7.17} \]

and

\[ z = \rho\cos(\varphi). \tag{3.7.18} \]



Example If a point P has Cartesian coordinates $(2, -2, 1)$, then its spherical coordinates
satisfy

\[ \rho = \sqrt{4 + 4 + 1} = 3, \]

\[ \tan(\theta) = \frac{-2}{2} = -1, \]

and

\[ \cos(\varphi) = \frac{1}{\sqrt{4 + 4 + 1}} = \frac{1}{3}. \]

Hence we have

\[ \theta = \frac{7\pi}{4} \]

and

\[ \varphi = \cos^{-1}\left( \frac{1}{3} \right) = 1.2310, \]

where we have rounded the value of $\varphi$ to four decimal places. Hence P has spherical
coordinates $\left( 3, \frac{7\pi}{4}, 1.2310 \right)$.

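Example If a point P has spherical coordinates $\left( 4, \frac{\pi}{3}, \frac{3\pi}{4} \right)$, then its Cartesian coordinates
are

\[ x = 4\cos\left( \frac{\pi}{3} \right) \sin\left( \frac{3\pi}{4} \right) = 4 \left( \frac{1}{2} \right) \left( \frac{1}{\sqrt{2}} \right) = \sqrt{2}, \]

\[ y = 4\sin\left( \frac{\pi}{3} \right) \sin\left( \frac{3\pi}{4} \right) = 4 \left( \frac{\sqrt{3}}{2} \right) \left( \frac{1}{\sqrt{2}} \right) = \sqrt{6}, \]

and

\[ z = 4\cos\left( \frac{3\pi}{4} \right) = 4 \left( -\frac{1}{\sqrt{2}} \right) = -2\sqrt{2}. \]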

Analogous to our work with polar coordinates, we think of the spherical coordinate
mapping

\[ (x,y,z) = F(\rho, \theta, \varphi) = (\rho\cos(\theta)\sin(\varphi),\ \rho\sin(\theta)\sin(\varphi),\ \rho\cos(\varphi)) \tag{3.7.19} \]

as a change of variables between $\rho\theta\varphi$-space and xyz-space. This mapping is particularly
useful for evaluating triple integrals because it maps rectangular regions in $\rho\theta\varphi$-space onto
spherical regions in xyz-space. For the most basic example, for any $a > 0$, F maps the
rectangular region

\[ E = \{(\rho, \theta, \varphi) : 0 \le \rho \le a,\ 0 \le \theta < 2\pi,\ 0 \le \varphi \le \pi\} \]

in $\rho\theta\varphi$-space onto the closed ball

\[ D = \bar{B}^3((0, 0, 0), a) = \{(x, y, z) : x^2 + y^2 + z^2 \le a^2\} \]

in xyz-space. More generally, for any $0 \le a < b$, $0 \le \alpha < \beta < 2\pi$, and $0 \le \gamma < \delta \le \pi$, F
maps the rectangular region

\[ E = \{(\rho, \theta, \varphi) : a \le \rho \le b,\ \alpha \le \theta \le \beta,\ \gamma \le \varphi \le \delta\} \]

onto a region D in xyz-space which lies between the concentric spheres $S^2((0, 0, 0), a)$ and
$S^2((0, 0, 0), b)$, and for which the angle $\theta$ lies between $\alpha$ and $\beta$ and the angle $\varphi$ between $\gamma$
and $\delta$. For example, if $\alpha = 0$, $\beta = \pi$, $\gamma = 0$, and $\delta = \frac{\pi}{2}$, then D is one-half of the region
lying between two concentric hemispheres with radii a and b.

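Before using the spherical coordinate change of variable in (3.7.19) to evaluate an
integral using (3.7.5), we need to compute the determinant of the Jacobian of F. Now

\[ \frac{\partial(x, y, z)}{\partial(\rho, \theta, \varphi)} =
\begin{bmatrix}
\frac{\partial}{\partial \rho} \rho\cos(\theta)\sin(\varphi) & \frac{\partial}{\partial \theta} \rho\cos(\theta)\sin(\varphi) & \frac{\partial}{\partial \varphi} \rho\cos(\theta)\sin(\varphi) \\
\frac{\partial}{\partial \rho} \rho\sin(\theta)\sin(\varphi) & \frac{\partial}{\partial \theta} \rho\sin(\theta)\sin(\varphi) & \frac{\partial}{\partial \varphi} \rho\sin(\theta)\sin(\varphi) \\
\frac{\partial}{\partial \rho} \rho\cos(\varphi) & \frac{\partial}{\partial \theta} \rho\cos(\varphi) & \frac{\partial}{\partial \varphi} \rho\cos(\varphi)
\end{bmatrix} =
\begin{bmatrix}
\cos(\theta)\sin(\varphi) & -\rho\sin(\theta)\sin(\varphi) & \rho\cos(\theta)\cos(\varphi) \\
\sin(\theta)\sin(\varphi) & \rho\cos(\theta)\sin(\varphi) & \rho\sin(\theta)\cos(\varphi) \\
\cos(\varphi) & 0 & -\rho\sin(\varphi)
\end{bmatrix}, \tag{3.7.20} \]

so, expanding along the third row,

\[ \begin{aligned}
\det \frac{\partial(x,y,z)}{\partial(\rho,\theta,\varphi)}
  &= \cos(\varphi) \left( -\rho^2\sin^2(\theta)\sin(\varphi)\cos(\varphi) - \rho^2\cos^2(\theta)\sin(\varphi)\cos(\varphi) \right) \\
  &\qquad - \rho\sin(\varphi) \left( \rho\cos^2(\theta)\sin^2(\varphi) + \rho\sin^2(\theta)\sin^2(\varphi) \right) \\
  &= -\rho^2\sin(\varphi)\cos^2(\varphi) \left( \sin^2(\theta) + \cos^2(\theta) \right) - \rho^2\sin^3(\varphi) \left( \sin^2(\theta) + \cos^2(\theta) \right) \\
  &= -\rho^2\sin(\varphi)\cos^2(\varphi) - \rho^2\sin^3(\varphi) \\
  &= -\rho^2\sin(\varphi) \left( \cos^2(\varphi) + \sin^2(\varphi) \right) \\
  &= -\rho^2\sin(\varphi).
\end{aligned} \tag{3.7.21} \]

Now $\rho \ge 0$ and, since $0 \le \varphi \le \pi$, $\sin(\varphi) \ge 0$, so

\[ \left| \det \frac{\partial(x,y,z)}{\partial(\rho,\theta,\varphi)} \right| = \rho^2\sin(\varphi). \tag{3.7.22} \]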
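The determinant formula can be sanity-checked numerically at a sample point. The sketch below is an assumption-laden illustration: the helpers `spherical`, `jacobian`, and `det3` are hypothetical names, the Jacobian is approximated by central differences with a step size chosen arbitrarily, and the sample point is arbitrary.

```python
import math


def spherical(rho, theta, phi):
    """The spherical coordinate mapping F(rho, theta, phi) = (x, y, z)."""
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))


def jacobian(F, u, h=1e-6):
    """Central-difference approximation of the Jacobian matrix of F at the point u."""
    n = len(u)
    cols = []
    for j in range(n):
        up = list(u)
        dn = list(u)
        up[j] += h
        dn[j] -= h
        fp, fm = F(*up), F(*dn)
        cols.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    # cols[j][i] approximates dF_i/du_j; transpose so that row i is the gradient of F_i
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def det3(m):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


rho, theta, phi = 2.0, 0.7, 1.1
numeric = det3(jacobian(spherical, (rho, theta, phi)))
exact = -rho**2 * math.sin(phi)
print(numeric, exact)  # the two values should agree to several decimal places
```

The sign of the determinant depends on the order in which the variables are listed; only its absolute value enters the change of variables formula.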



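Example In an earlier example we used the fact that the volume of a sphere of radius 1
is $\frac{4\pi}{3}$. In this example we will verify that the volume of a sphere of radius a is $\frac{4}{3}\pi a^3$. Let
V be the volume of

\[ D = \bar{B}^3((0,0,0), a), \]

the closed ball of radius a centered at the origin in $\mathbb{R}^3$. Then

\[ V = \iiint_D dx\,dy\,dz. \]

Although we may evaluate this integral using Cartesian coordinates, we will find it significantly
easier to use spherical coordinates. Using the spherical coordinate change of
variables

\[ x = \rho\cos(\theta)\sin(\varphi), \]

\[ y = \rho\sin(\theta)\sin(\varphi), \]

and

\[ z = \rho\cos(\varphi), \]

the region D in xyz-space corresponds to the region

\[ E = \{(\rho, \theta, \varphi) : 0 \le \rho \le a,\ 0 \le \theta \le 2\pi,\ 0 \le \varphi \le \pi\} \]

in $\rho\theta\varphi$-space. Using (3.7.22) in the change of variables formula (3.7.5), we have

\[ \begin{aligned}
V &= \iiint_D dx\,dy\,dz \\
  &= \iiint_E \left| \det \frac{\partial(x,y,z)}{\partial(\rho,\theta,\varphi)} \right| d\rho\,d\theta\,d\varphi \\
  &= \int_0^a \int_0^{2\pi} \int_0^{\pi} \rho^2\sin(\varphi)\,d\varphi\,d\theta\,d\rho \\
  &= \int_0^a \int_0^{2\pi} \left( -\rho^2\cos(\varphi) \right) \Big|_0^{\pi}\,d\theta\,d\rho \\
  &= \int_0^a \int_0^{2\pi} \left( -\rho^2(-1 - 1) \right) d\theta\,d\rho \\
  &= 2 \int_0^a \int_0^{2\pi} \rho^2\,d\theta\,d\rho \\
  &= 4\pi \int_0^a \rho^2\,d\rho \\
  &= \frac{4\pi}{3} \rho^3 \Big|_0^a \\
  &= \frac{4}{3}\pi a^3.
\end{aligned} \]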
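As a rough numerical check of this computation, one can approximate the triple integral of $\rho^2\sin(\varphi)$ over the rectangle E directly. This is a minimal sketch under stated assumptions: the function name `ball_volume`, the midpoint rule, and the grid size are illustrative choices, not part of the text.

```python
import math


def ball_volume(a, n=60):
    """Midpoint approximation of the integral of rho^2 sin(phi)
    over [0, a] x [0, 2*pi] x [0, pi] in rho-theta-phi space."""
    h_rho, h_theta, h_phi = a / n, 2 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * h_rho
        for j in range(n):              # theta does not appear in the integrand
            for k in range(n):
                phi = (k + 0.5) * h_phi
                total += rho**2 * math.sin(phi)
    return total * h_rho * h_theta * h_phi


print(ball_volume(1.0), 4 * math.pi / 3)
print(ball_volume(2.0), 4 * math.pi * 2.0**3 / 3)
```

Because the integrand does not depend on $\theta$, the loop over `j` only contributes the factor $2\pi$; it is kept to mirror the structure of the iterated integral.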



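Example Suppose we wish to evaluate

\[ \iiint_D \log\sqrt{x^2 + y^2 + z^2}\,dx\,dy\,dz, \]

where D is the region in $\mathbb{R}^3$ which lies between the two spheres with equations $x^2 + y^2 + z^2 = 1$
and $x^2 + y^2 + z^2 = 4$ and above the xy-plane. Under the spherical coordinate change of
variables

\[ x = \rho\cos(\theta)\sin(\varphi), \]

\[ y = \rho\sin(\theta)\sin(\varphi), \]

and

\[ z = \rho\cos(\varphi), \]

the region D in xyz-space corresponds to the region

\[ E = \left\{ (\rho, \theta, \varphi) : 1 \le \rho \le 2,\ 0 \le \theta \le 2\pi,\ 0 \le \varphi \le \frac{\pi}{2} \right\} \]

in $\rho\theta\varphi$-space. Using (3.7.22) in the change of variables formula (3.7.5), we have

\[ \begin{aligned}
\iiint_D \log\sqrt{x^2 + y^2 + z^2}\,dx\,dy\,dz
  &= \iiint_E \log(\rho) \left| \det \frac{\partial(x,y,z)}{\partial(\rho,\theta,\varphi)} \right| d\rho\,d\theta\,d\varphi \\
  &= \int_1^2 \int_0^{2\pi} \int_0^{\pi/2} \rho^2\log(\rho)\sin(\varphi)\,d\varphi\,d\theta\,d\rho \\
  &= \int_1^2 \int_0^{2\pi} \left( -\rho^2\log(\rho)\cos(\varphi) \right) \Big|_0^{\pi/2}\,d\theta\,d\rho \\
  &= \int_1^2 \int_0^{2\pi} \left( -\rho^2\log(\rho) \right)(0 - 1)\,d\theta\,d\rho \\
  &= \int_1^2 \int_0^{2\pi} \rho^2\log(\rho)\,d\theta\,d\rho \\
  &= 2\pi \int_1^2 \rho^2\log(\rho)\,d\rho.
\end{aligned} \]

We use integration by parts to evaluate this final integral: letting

\[ u = \log(\rho), \qquad dv = \rho^2\,d\rho, \]

\[ du = \frac{1}{\rho}\,d\rho, \qquad v = \frac{\rho^3}{3}, \]

we have

\[ \begin{aligned}
\iiint_D \log\sqrt{x^2 + y^2 + z^2}\,dx\,dy\,dz
  &= 2\pi \left( \frac{1}{3}\rho^3\log(\rho) \Big|_1^2 - \frac{1}{3} \int_1^2 \rho^2\,d\rho \right) \\
  &= \frac{16}{3}\pi\log(2) - \frac{2\pi\rho^3}{9} \Big|_1^2 \\
  &= \frac{16}{3}\pi\log(2) - \frac{14\pi}{9} \\
  &= \frac{2\pi}{3} \left( 8\log(2) - \frac{7}{3} \right).
\end{aligned} \]

Problems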

1. Find the area of the region enclosed by the ellipse with equation $x^2 + 4y^2 = 4$.

2. Given $a > 0$ and $b > 0$, show that the area enclosed by the ellipse with equation

\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1
\]

is $\pi ab$.

3. Find the volume of the region enclosed by the ellipsoid with equation

\[
\frac{x^2}{25} + y^2 + \frac{z^2}{4} = 1.
\]

4. Given $a > 0$, $b > 0$, and $c > 0$, show that the volume of the region enclosed by the ellipsoid

\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1
\]

is $\frac{4}{3}\pi abc$.



5. Find the polar coordinates for each of the following points given in Cartesian coordinates.

(a) $(1,1)$ (b) $(-2,3)$

(c) $(-1,3)$ (d) $(4,-4)$

6. Find the Cartesian coordinates for each of the following points given in polar coordinates.

(a) (3,0) (b) (2,^) 

(c) (5,tt) (d) U,f 



7. Evaluate 



where $D$ is the disk in $\mathbb{R}^2$ of radius 2 centered at the origin.



8. Evaluate

\[
\iint_D \sin(x^2 + y^2)\,dx\,dy,
\]

where $D$ is the disk in $\mathbb{R}^2$ of radius 1 centered at the origin.

9. Evaluate

\[
\iint_D \frac{1}{x^2 + y^2}\,dx\,dy,
\]

where $D$ is the region in the first quadrant of $\mathbb{R}^2$ which lies between the circle with equation $x^2 + y^2 = 1$ and the circle with equation $x^2 + y^2 = 16$.

10. Evaluate

\[
\iint_D \log(x^2 + y^2)\,dx\,dy,
\]

where $D$ is the region in $\mathbb{R}^2$ which lies between the circle with equation $x^2 + y^2 = 1$ and the circle with equation $x^2 + y^2 = 4$.

11. Using polar coordinates, verify that the area of a circle of radius $r$ is $\pi r^2$.

12. Let

\[
I = \int_{-\infty}^{\infty} e^{-\frac{x^2}{2}}\,dx.
\]

(a) Show that

\[
I^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(x^2+y^2)}\,dx\,dy.
\]

(b) Show that

\[
I^2 = \int_0^{\infty}\int_0^{2\pi} r e^{-\frac{r^2}{2}}\,d\theta\,dr.
\]

(c) Show that

\[
\int_{-\infty}^{\infty} e^{-\frac{x^2}{2}}\,dx = \sqrt{2\pi}.
\]



13. Find the spherical coordinates of the point with Cartesian coordinates $(-1, 1, 2)$.

14. Find the spherical coordinates of the point with Cartesian coordinates $(3, 2, -1)$.

15. Find the Cartesian coordinates of the point with spherical coordinates (2, 2 J I ). 

16. Find the Cartesian coordinates of the point with spherical coordinates (5, |). 

17. Evaluate 



where $D$ is the closed ball in $\mathbb{R}^3$ of radius 2 centered at the origin.

18. Evaluate

\[
\iiint_D \frac{1}{\sqrt{x^2 + y^2 + z^2}}\,dx\,dy\,dz,
\]

where $D$ is the region in $\mathbb{R}^3$ between the two spheres with equations $x^2 + y^2 + z^2 = 4$ and $x^2 + y^2 + z^2 = 9$.

19. Evaluate

\[
\iiint_D \sin\left(\sqrt{x^2 + y^2 + z^2}\right)dx\,dy\,dz,
\]

where $D$ is the region in $\mathbb{R}^3$ described by $x \geq 0$, $y \geq 0$, $z \geq 0$, and $x^2 + y^2 + z^2 \leq 1$.
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20. Evaluate 



///, 



e- {x2+y2+z2) dxdydz, 

D 

where $D$ is the closed ball in $\mathbb{R}^3$ of radius 3 centered at the origin.

21. Let $D$ be the region in $\mathbb{R}^3$ described by $x^2 + y^2 + z^2 \leq 1$ and $z \geq \sqrt{x^2 + y^2}$.

(a) Explain why the spherical coordinate change of variables maps the region

\[
E = \left\{(\rho,\theta,\varphi) : 0 \leq \rho \leq 1,\ 0 \leq \theta \leq 2\pi,\ 0 \leq \varphi \leq \tfrac{\pi}{4}\right\}
\]

onto $D$.

(b) Find the volume of $D$.

22. If a point $P$ has Cartesian coordinates $(x,y,z)$, then the cylindrical coordinates of $P$ are $(r, \theta, z)$, where $r$ and $\theta$ are the polar coordinates of $(x,y)$. Show that

\[
\left|\det \frac{\partial(x,y,z)}{\partial(r,\theta,z)}\right| = r.
\]

23. Use cylindrical coordinates to evaluate

\[
\iiint_D \sqrt{x^2 + y^2}\,dx\,dy\,dz,
\]

where $D$ is the region in $\mathbb{R}^3$ described by $1 \leq x^2 + y^2 \leq 4$ and $0 \leq z \leq 5$.

24. A drill with a bit with a radius of 1 centimeter is used to drill a hole through the center of a solid ball of radius 3 centimeters. What is the volume of the remaining solid?

25. Let $D$ be the set of all points in the intersection of the two solid cylinders in $\mathbb{R}^3$ described by $x^2 + y^2 \leq 1$ and $x^2 + z^2 \leq 1$. Find the volume of $D$.

Geometry, Limits, and Continuity

In this chapter we will treat the general case of a function mapping $\mathbb{R}^m$ to $\mathbb{R}^n$. Since the cases $m = 1$ and $n = 1$ have been handled in previous chapters, our emphasis will be on the higher dimensional cases, most importantly when $m$ and $n$ are 2 or 3. We will begin in this section with some basic terminology and definitions.

Parametrized surfaces

If $f : \mathbb{R}^m \to \mathbb{R}^n$ has domain $D$, we call the set $S$ of all points $\mathbf{y}$ in $\mathbb{R}^n$ for which $\mathbf{y} = f(\mathbf{x})$ for some $\mathbf{x}$ in $D$ the image of $f$. That is,

\[
S = \{f(\mathbf{x}) : \mathbf{x} \in D\}, \tag{4.1.1}
\]

which is the same as what we have previously called the range of $f$. If $m = 1$, $S$ is a curve as defined in Section 2.1. If $m > 1$ and $n > m$, then we call $S$ an $m$-dimensional surface in $\mathbb{R}^n$. If we let $\mathbf{x} = (x_1, x_2, \ldots, x_m)$ and $(y_1, y_2, \ldots, y_n) = f(x_1, x_2, \ldots, x_m)$, then, for $k = 1, 2, \ldots, n$, we call the function $f_k : \mathbb{R}^m \to \mathbb{R}$ defined by

\[
f_k(x_1, x_2, \ldots, x_m) = y_k
\]

the $k$-th coordinate function of $f$. We call the system of equations

\[
\begin{aligned}
y_1 &= f_1(x_1, x_2, \ldots, x_m), \\
y_2 &= f_2(x_1, x_2, \ldots, x_m), \\
&\ \ \vdots \\
y_n &= f_n(x_1, x_2, \ldots, x_m),
\end{aligned} \tag{4.1.2}
\]

a parametrization of the surface $S$. Note that $f_k$ is the type of function we studied in Chapter 3. On the other hand, if we fix values of $x_i$ for all $i \neq k$, then the function $\varphi_k : \mathbb{R} \to \mathbb{R}^n$ defined by

\[
\varphi_k(t) = f(x_1, x_2, \ldots, x_{k-1}, t, x_{k+1}, \ldots, x_m) \tag{4.1.3}
\]

is of the type we studied in Chapter 2. In particular, for each $k = 1, 2, \ldots, m$, $\varphi_k$ parametrizes a curve which lies on the surface $S$. The following examples illustrate how the latter remark is useful when trying to picture a parametrized surface $S$.

Example Consider $f : \mathbb{R}^2 \to \mathbb{R}^3$ defined by

\[
f(s,t) = (t\cos(s), t\sin(s), t)
\]

for $0 \leq s \leq 2\pi$ and $-\infty < t < \infty$. The image of $f$ is the surface $S$ in $\mathbb{R}^3$ parametrized by the equations

\[
\begin{aligned}
x &= t\cos(s), \\
y &= t\sin(s), \\
z &= t.
\end{aligned}
\]

Figure 4.1.1 Cone parametrized by $f(s,t) = (t\cos(s), t\sin(s), t)$

Note that for a fixed value of $t$, the function

\[
\varphi_1(s) = (t\cos(s), t\sin(s), t)
\]

parametrizes a circle of radius $|t|$ on the plane $z = t$ with center at $(0, 0, t)$. On the other hand, for a fixed value of $s$, the function

\[
\varphi_2(t) = (t\cos(s), t\sin(s), t) = t(\cos(s), \sin(s), 1)
\]

parametrizes a line through the origin in the direction of the vector $(\cos(s), \sin(s), 1)$. Hence the surface $S$ is a cone in $\mathbb{R}^3$, part of which is shown in Figure 4.1.1. Notice how the surface was drawn by plotting the curves corresponding to fixed values of $s$ and $t$ (that is, the curves parametrized by $\varphi_1$ and $\varphi_2$), and then filling in the resulting curvilinear "rectangles."
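The description of the cone by its grid curves can be tested with a short computation. The sketch below samples both families of curves and checks that every sampled point satisfies $x^2 + y^2 = z^2$; the names `cone` and `grid_curves` and the sampling ranges are illustrative assumptions, not taken from the text.

```python
import math

def cone(s, t):
    """The parametrization f(s, t) = (t cos s, t sin s, t) from the example."""
    return (t * math.cos(s), t * math.sin(s), t)

def grid_curves(s_values, t_values, samples=50):
    """Sample the curves with t fixed (circles) and with s fixed (lines)."""
    circles = [[cone(2 * math.pi * i / samples, t) for i in range(samples + 1)]
               for t in t_values]
    lines = [[cone(s, -3 + 6 * i / samples) for i in range(samples + 1)]
             for s in s_values]
    return circles, lines

circles, lines = grid_curves(
    s_values=[k * math.pi / 4 for k in range(8)],
    t_values=[-2, -1, 1, 2],
)

# Every sampled point should satisfy x^2 + y^2 = z^2 up to rounding error.
max_defect = max(abs(x * x + y * y - z * z)
                 for curve in circles + lines for (x, y, z) in curve)
print(max_defect)
```

Passing the lists `circles` and `lines` to a 3D plotting routine would reproduce the kind of picture described for Figure 4.1.1.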

Example For a fixed $a > 0$, consider the function $f : \mathbb{R}^2 \to \mathbb{R}^3$ defined by

\[
f(s,t) = (a\cos(s)\sin(t), a\sin(s)\sin(t), a\cos(t))
\]

for $0 \leq s \leq 2\pi$ and $0 \leq t \leq \pi$. The image of $f$ is the surface $S$ in $\mathbb{R}^3$ parametrized by the equations

\[
\begin{aligned}
x &= a\cos(s)\sin(t), \\
y &= a\sin(s)\sin(t), \\
z &= a\cos(t).
\end{aligned} \tag{4.1.4}
\]

Figure 4.1.2 Unit sphere parametrized by $f(s,t) = (\cos(s)\sin(t), \sin(s)\sin(t), \cos(t))$

Note that these are the equations for the spherical coordinate change of variables discussed in Section 3.7, with $\rho = a$, $\theta = s$, and $\varphi = t$. Since $a$ is fixed while $s$ varies from 0 to $2\pi$ and $t$ varies from 0 to $\pi$, it follows that $S$ is a sphere of radius $a$ with center $(0,0,0)$. Figure 4.1.2 displays $S$ when $a = 1$. If we had not previously studied spherical coordinates, we could reach this conclusion about $S$ as follows. First note that

\[
\begin{aligned}
x^2 + y^2 + z^2 &= a^2\cos^2(s)\sin^2(t) + a^2\sin^2(s)\sin^2(t) + a^2\cos^2(t) \\
&= a^2\sin^2(t)\bigl(\cos^2(s) + \sin^2(s)\bigr) + a^2\cos^2(t) \\
&= a^2\bigl(\sin^2(t) + \cos^2(t)\bigr) \\
&= a^2,
\end{aligned}
\]

from which it follows that every point of $S$ lies on the sphere of radius $a$ centered at the origin. Now for a fixed value of $t$,

\[
\varphi_1(s) = (a\cos(s)\sin(t), a\sin(s)\sin(t), a\cos(t))
\]

parametrizes a circle in the plane $z = a\cos(t)$ with center $(0, 0, a\cos(t))$ and radius $a\sin(t)$. As $t$ varies from 0 to $\pi$, these circles vary from a circle in the $z = a$ plane with center $(0, 0, a)$ and radius 0 (when $t = 0$) to a circle in the $xy$-plane with center $(0, 0, 0)$ and radius $a$ (when $t = \frac{\pi}{2}$) to a circle in the $z = -a$ plane with center $(0, 0, -a)$ and radius 0 (when $t = \pi$). In other words, the circles fill in all the "lines of latitude" of the sphere from the "North Pole" to the "South Pole," and hence $S$ is all of the sphere. One may also show that the functions

\[
\varphi_2(t) = (a\cos(s)\sin(t), a\sin(s)\sin(t), a\cos(t))
\]

parametrize the "lines of longitude" of $S$ as $s$ varies from 0 to $2\pi$. Both the lines of "latitude" and "longitude" are visible in Figure 4.1.2.

Example Suppose $0 < b < a$ and define $f : \mathbb{R}^2 \to \mathbb{R}^3$ by

\[
f(s,t) = ((a + b\cos(t))\cos(s), (a + b\cos(t))\sin(s), b\sin(t))
\]

for $0 \leq s \leq 2\pi$ and $0 \leq t \leq 2\pi$. The image of $f$ is the surface $T$ parametrized by the equations

\[
\begin{aligned}
x &= (a + b\cos(t))\cos(s), \\
y &= (a + b\cos(t))\sin(s), \\
z &= b\sin(t).
\end{aligned}
\]

Note that for a fixed value of $t$,

\[
\varphi_1(s) = ((a + b\cos(t))\cos(s), (a + b\cos(t))\sin(s), b\sin(t))
\]

parametrizes a circle in the plane $z = b\sin(t)$ with center $(0, 0, b\sin(t))$ and radius $a + b\cos(t)$. In particular, when $t = 0$, we have a circle in the $xy$-plane with center $(0, 0, 0)$ and radius $a + b$; when $t = \frac{\pi}{2}$, we have a circle on the plane $z = b$ with center $(0, 0, b)$ and radius $a$; when $t = \pi$, we have a circle on the $xy$-plane with center $(0, 0, 0)$ and radius $a - b$; when $t = \frac{3\pi}{2}$, we have a circle on the $z = -b$ plane with center $(0, 0, -b)$ and radius $a$; and when $t = 2\pi$, we are back to a circle in the $xy$-plane with center $(0,0,0)$ and radius $a + b$. For fixed values of $s$, the curves parametrized by

\[
\varphi_2(t) = ((a + b\cos(t))\cos(s), (a + b\cos(t))\sin(s), b\sin(t))
\]

are not identified as easily. However, some particular cases are illuminating. When $s = 0$, we have a circle in the $xz$-plane with center $(a, 0, 0)$ and radius $b$; when $s = \frac{\pi}{2}$, we have a circle in the $yz$-plane with center $(0, a, 0)$ and radius $b$; when $s = \pi$, we have a circle in the $xz$-plane with center $(-a, 0, 0)$ and radius $b$; when $s = \frac{3\pi}{2}$, we have a circle in the $yz$-plane with center $(0, -a, 0)$ and radius $b$; and when $s = 2\pi$, we are back to a circle in the $xz$-plane with center $(a, 0, 0)$ and radius $b$. Putting all this together, we see that $T$ is a torus, the surface of a doughnut shaped object. Figure 4.1.3 shows one such torus, the case $a = 3$ and $b = 1$.
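The sphere and torus examples above can both be checked against implicit equations. The sketch below assumes the standard facts that the sphere satisfies $x^2 + y^2 + z^2 = a^2$ (derived in the text) and that the torus satisfies $(\sqrt{x^2 + y^2} - a)^2 + z^2 = b^2$ (not stated in the text, but it follows directly from the parametrization); the function names are illustrative.

```python
import math

def sphere(s, t, a=1.0):
    """f(s, t) = (a cos s sin t, a sin s sin t, a cos t), as in (4.1.4)."""
    return (a * math.cos(s) * math.sin(t),
            a * math.sin(s) * math.sin(t),
            a * math.cos(t))

def torus(s, t, a=3.0, b=1.0):
    """f(s, t) = ((a + b cos t) cos s, (a + b cos t) sin s, b sin t)."""
    r = a + b * math.cos(t)   # radius of the circle obtained by fixing t
    return (r * math.cos(s), r * math.sin(s), b * math.sin(t))

def sphere_defect(p, a=1.0):
    x, y, z = p
    return x * x + y * y + z * z - a * a

def torus_defect(p, a=3.0, b=1.0):
    x, y, z = p
    return (math.hypot(x, y) - a) ** 2 + z * z - b * b

n = 40
grid = [(2 * math.pi * i / n, j) for i in range(n + 1) for j in range(n + 1)]
worst_sphere = max(abs(sphere_defect(sphere(s, math.pi * j / n))) for s, j in grid)
worst_torus = max(abs(torus_defect(torus(s, 2 * math.pi * j / n))) for s, j in grid)
print(worst_sphere, worst_torus)   # both should be at rounding-error level
```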
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Figure 4.1.3 A torus: f(s, t) = (3 + cos(i)) cos(s), (3 + cos(t)) sin(s), sin(t)) 



Vector fields

We call a function $f : \mathbb{R}^n \to \mathbb{R}^n$, that is, a function for which the domain and range space have the same dimension, a vector field. We have seen a few examples of such functions already. For example, the change of variable functions in Section 3.7 were of this type. Also, given a function $g : \mathbb{R}^n \to \mathbb{R}$, the gradient of $g$,

\[
\nabla g(\mathbf{x}) = \left(\frac{\partial}{\partial x_1} g(\mathbf{x}), \frac{\partial}{\partial x_2} g(\mathbf{x}), \ldots, \frac{\partial}{\partial x_n} g(\mathbf{x})\right),
\]

is a function from $\mathbb{R}^n$ to $\mathbb{R}^n$. As we saw in our discussion of gradient vector fields in Section 3.2, a plot showing the vectors $f(\mathbf{x})$ at each point in a rectangular grid provides a useful geometric view of a vector field $f$.



Example Consider the vector field $f : \mathbb{R}^n \to \mathbb{R}^n$ defined by

\[
f(\mathbf{x}) = -\frac{\mathbf{x}}{\|\mathbf{x}\|^2}
\]

for all $\mathbf{x} \neq \mathbf{0}$. Note that $f(\mathbf{x})$ is a vector of length

\[
\left\| -\frac{\mathbf{x}}{\|\mathbf{x}\|^2} \right\| = \frac{\|\mathbf{x}\|}{\|\mathbf{x}\|^2} = \frac{1}{\|\mathbf{x}\|}
\]

pointing in the direction opposite that of $\mathbf{x}$. If $n = 2$, the coordinate functions of $f$ are

\[
f_1(x_1, x_2) = -\frac{x_1}{x_1^2 + x_2^2}
\]

and

\[
f_2(x_1, x_2) = -\frac{x_2}{x_1^2 + x_2^2}.
\]

Figure 4.1.4 shows a plot of the vectors $f(\mathbf{x})$ for this case, drawn on a grid over the rectangle $[-3, 3] \times [-3, 3]$, and for the case $n = 3$, using the cube $[-3, 3] \times [-3, 3] \times [-3, 3]$. Note that these plots do not show the vectors $f(\mathbf{x})$ themselves, but vectors which have been scaled proportionately so they do not overlap one another.
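To make the example concrete, here is a small sketch that evaluates this field at a few points and compares the length of $f(\mathbf{x})$ with $1/\|\mathbf{x}\|$. It assumes the field is $f(\mathbf{x}) = -\mathbf{x}/\|\mathbf{x}\|^2$, as described in the example; the helper names `field` and `norm` are illustrative.

```python
import math

def field(x):
    """f(x) = -x / ||x||^2 for a nonzero point x given as a tuple of floats."""
    norm_sq = sum(c * c for c in x)
    if norm_sq == 0.0:
        raise ValueError("the field is not defined at the origin")
    return tuple(-c / norm_sq for c in x)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

for x in [(1.0, 0.0), (3.0, 4.0), (0.5, -0.5), (1.0, 2.0, 2.0)]:
    v = field(x)
    # The last two columns should match: ||f(x)|| = 1 / ||x||.
    print(x, v, norm(v), 1 / norm(x))
```

Because the lengths blow up near the origin, a plotting routine would normally rescale the arrows, which is the scaling the text mentions for Figure 4.1.4.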

Limits and continuity

The definitions of limits and continuity for functions $f : \mathbb{R}^m \to \mathbb{R}^n$ follow the familiar pattern.

Definition Let $\mathbf{a}$ be a point in $\mathbb{R}^m$ and let $O$ be the set of all points in the open ball of radius $r > 0$ centered at $\mathbf{a}$ except $\mathbf{a}$. That is,

\[
O = \{\mathbf{x} : \mathbf{x} \in B^m(\mathbf{a}, r),\ \mathbf{x} \neq \mathbf{a}\}.
\]

Suppose $f : \mathbb{R}^m \to \mathbb{R}^n$ is defined for all $\mathbf{x}$ in $O$. We say the limit of $f(\mathbf{x})$ as $\mathbf{x}$ approaches $\mathbf{a}$ is $\mathbf{L}$, written $\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = \mathbf{L}$, if for every sequence of points $\{\mathbf{x}_k\}$ in $O$,

\[
\lim_{k \to \infty} f(\mathbf{x}_k) = \mathbf{L} \tag{4.1.5}
\]

whenever $\lim_{k \to \infty} \mathbf{x}_k = \mathbf{a}$.

In Section 2.1 we saw that a sequence of points in $\mathbb{R}^n$ has a limit if and only if the individual coordinates of the points in the sequence each have a limit. The following proposition is an immediate consequence.

Proposition If $f_k : \mathbb{R}^m \to \mathbb{R}$, $k = 1, 2, \ldots, n$, is the $k$th coordinate function of $f : \mathbb{R}^m \to \mathbb{R}^n$, then

\[
\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = (L_1, L_2, \ldots, L_n)
\]

if and only if

\[
\lim_{\mathbf{x} \to \mathbf{a}} f_k(\mathbf{x}) = L_k
\]

for $k = 1, 2, \ldots, n$.

In other words, the computation of limits for functions $f : \mathbb{R}^m \to \mathbb{R}^n$ reduces to the familiar problem of computing limits of real-valued functions, as we discussed in Section 3.1.

Example If

\[
f(x,y,z) = (x^2 - 3yz, 4xz),
\]

a function from $\mathbb{R}^3$ to $\mathbb{R}^2$, then, for example,

\[
\lim_{(x,y,z) \to (1,-2,3)} f(x,y,z) = \left(\lim_{(x,y,z) \to (1,-2,3)} (x^2 - 3yz),\ \lim_{(x,y,z) \to (1,-2,3)} 4xz\right) = (19, 12).
\]
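The sequence definition of the limit can be illustrated numerically. The sketch below evaluates $f$ along one particular sequence approaching $(1, -2, 3)$; the choice of sequence is an arbitrary assumption made for illustration, and a single sequence is of course only suggestive, not a proof.

```python
def f(x, y, z):
    """The function f(x, y, z) = (x^2 - 3yz, 4xz) from the example."""
    return (x ** 2 - 3 * y * z, 4 * x * z)

a = (1.0, -2.0, 3.0)
for k in (1, 10, 100, 1000, 10000):
    # x_k approaches a along the direction (1, 1, 1); any other approach would do.
    xk = tuple(c + 1.0 / k for c in a)
    print(k, f(*xk))

print(f(*a))   # the coordinatewise limits: (19.0, 12.0)
```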

Definition Suppose $f : \mathbb{R}^m \to \mathbb{R}^n$ is defined for all $\mathbf{x}$ in some open ball $B^m(\mathbf{a}, r)$, $r > 0$. We say $f$ is continuous at $\mathbf{a}$ if $\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = f(\mathbf{a})$.

The following result is an immediate consequence of the previous proposition.

Proposition If $f_k : \mathbb{R}^m \to \mathbb{R}$, $k = 1, 2, \ldots, n$, is the $k$th coordinate function of $f : \mathbb{R}^m \to \mathbb{R}^n$, then $f$ is continuous at a point $\mathbf{a}$ if and only if $f_k$ is continuous at $\mathbf{a}$ for $k = 1, 2, \ldots, n$.

In other words, checking for continuity for a function $f : \mathbb{R}^m \to \mathbb{R}^n$ reduces to checking the continuity of real-valued functions, a familiar problem from Section 3.1.

Example The function

\[
f(x, y) = (3\sin(x + y), 4x^2 y)
\]

has coordinate functions

\[
f_1(x,y) = 3\sin(x + y)
\]

and

\[
f_2(x,y) = 4x^2 y.
\]

Since, from our results in Section 3.1, both $f_1$ and $f_2$ are continuous at every point in $\mathbb{R}^2$, it follows that $f$ is continuous at every point in $\mathbb{R}^2$.

Definition We say a function $f : \mathbb{R}^m \to \mathbb{R}^n$ is continuous on an open set $U$ if $f$ is continuous at every point $\mathbf{u}$ in $U$.

Example We may restate the conclusion of the previous example by saying that

\[
f(x, y) = (3\sin(x + y), 4x^2 y)
\]

is continuous on $\mathbb{R}^2$.

Problems

1. For each of the following, plot the surface parametrized by the given function.

(a) $f(s,t) = (t^2\cos(s), t^2\sin(s), t^2)$, $0 \leq s \leq 2\pi$, $0 \leq t \leq 3$

(b) $f(u,v) = (3\cos(u)\sin(v), \sin(u)\sin(v), 2\cos(v))$, $0 \leq u \leq 2\pi$, $0 \leq v \leq \pi$

(c) $g(s,t) = ((4 + 2\cos(t))\cos(s), (4 + 2\cos(t))\sin(s), 2\sin(t))$, $0 \leq s \leq 2\pi$, $0 \leq t \leq 2\pi$

(d) $f(s,t) = ((5 + 2\cos(t))\cos(s), 2(5 + 2\cos(t))\sin(s), \sin(t))$, $0 \leq s \leq 2\pi$, $0 \leq t \leq 2\pi$

(e) v) = (sin(u), (3 + cos(w)) cos(«), (3 + cos(u)) sin('u)), < u < 2tt, < v < 2ir 

(f) $g(s,t) = (s, s^2 + t^2, t)$, $-2 \leq s \leq 2$, $-2 \leq t \leq 2$

(g) $f(x,y) = (y\cos(x), y, y\sin(x))$, $0 \leq x \leq 2\pi$, $-5 \leq y \leq 5$

2. Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ and we define $F : \mathbb{R}^2 \to \mathbb{R}^3$ by $F(s,t) = (s, t, f(s,t))$. Describe the surface parametrized by $F$.

3. Find a parametrization for the surface that is the graph of the function $f(x,y) = x^2 + y^2$.

4. Make plots like those in Figure 4.1.4 for each of the following vector fields. Experiment with the rectangle used for the grid, as well as with the number of vectors drawn.

(a) $f(x,y) = (y, -x)$

(b) $g(x,y) = (y, -\sin(x))$

(c) $f(u,v) = (v, u - u^3 - v)$

(d) $f(x,y) = (x(1 - y^2) - y, x)$

(e) f(x, y, z) = ( 10(y - x), 28x - y - xz,--z + xy 




^(u - l,v - 2, w - 1) 



5. Find the set of points in $\mathbb{R}^2$ for which the vector field




is continuous. 



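6. For which points in $\mathbb{R}^n$ is the vector field

\[
f(\mathbf{x}) = \frac{\mathbf{x}}{\log(\|\mathbf{x}\|)}
\]

a continuous function?

Best affine approximations

The following definitions should look very familiar.

Definition Suppose $f : \mathbb{R}^m \to \mathbb{R}^n$ is defined on an open ball containing the point $\mathbf{c}$. We call an affine function $A : \mathbb{R}^m \to \mathbb{R}^n$ the best affine approximation to $f$ at $\mathbf{c}$ if (1) $A(\mathbf{c}) = f(\mathbf{c})$ and (2) $\|R(\mathbf{h})\|$ is $o(\mathbf{h})$, where

\[
R(\mathbf{h}) = f(\mathbf{c} + \mathbf{h}) - A(\mathbf{c} + \mathbf{h}). \tag{4.2.1}
\]

Suppose $A : \mathbb{R}^m \to \mathbb{R}^n$ is the best affine approximation to $f$ at $\mathbf{c}$. Then, from our work in Section 1.5, there exists an $n \times m$ matrix $M$ and a vector $\mathbf{b}$ in $\mathbb{R}^n$ such that

\[
A(\mathbf{x}) = M\mathbf{x} + \mathbf{b} \tag{4.2.2}
\]

for all $\mathbf{x}$ in $\mathbb{R}^m$. Moreover, the condition $A(\mathbf{c}) = f(\mathbf{c})$ implies $f(\mathbf{c}) = M\mathbf{c} + \mathbf{b}$, and so $\mathbf{b} = f(\mathbf{c}) - M\mathbf{c}$. Hence we have

\[
A(\mathbf{x}) = M\mathbf{x} + f(\mathbf{c}) - M\mathbf{c} = M(\mathbf{x} - \mathbf{c}) + f(\mathbf{c}) \tag{4.2.3}
\]

for all $\mathbf{x}$ in $\mathbb{R}^m$. Thus to find the best affine approximation we need only identify the matrix $M$ in (4.2.3).

Definition Suppose $f : \mathbb{R}^m \to \mathbb{R}^n$ is defined on an open ball containing the point $\mathbf{c}$. If $f$ has a best affine approximation at $\mathbf{c}$, then we say $f$ is differentiable at $\mathbf{c}$. Moreover, if the best affine approximation to $f$ at $\mathbf{c}$ is given by

\[
A(\mathbf{x}) = M(\mathbf{x} - \mathbf{c}) + f(\mathbf{c}), \tag{4.2.4}
\]

then we call $M$ the derivative of $f$ at $\mathbf{c}$ and write $Df(\mathbf{c}) = M$.

Now suppose $f : \mathbb{R}^m \to \mathbb{R}^n$ and $A$ is an affine function with $A(\mathbf{c}) = f(\mathbf{c})$. Let $f_k$ and $A_k$ be the $k$th coordinate functions of $f$ and $A$, respectively, for $k = 1, 2, \ldots, n$, and let $R$ be the remainder function

\[
\begin{aligned}
R(\mathbf{h}) &= f(\mathbf{c} + \mathbf{h}) - A(\mathbf{c} + \mathbf{h}) \\
&= \bigl(f_1(\mathbf{c} + \mathbf{h}) - A_1(\mathbf{c} + \mathbf{h}),\ f_2(\mathbf{c} + \mathbf{h}) - A_2(\mathbf{c} + \mathbf{h}),\ \ldots,\ f_n(\mathbf{c} + \mathbf{h}) - A_n(\mathbf{c} + \mathbf{h})\bigr).
\end{aligned}
\]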

1 Copyright © by Dan Sloughter 2001 
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Then 
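\[
\frac{R(\mathbf{h})}{\|\mathbf{h}\|} = \left(\frac{f_1(\mathbf{c} + \mathbf{h}) - A_1(\mathbf{c} + \mathbf{h})}{\|\mathbf{h}\|},\ \frac{f_2(\mathbf{c} + \mathbf{h}) - A_2(\mathbf{c} + \mathbf{h})}{\|\mathbf{h}\|},\ \ldots,\ \frac{f_n(\mathbf{c} + \mathbf{h}) - A_n(\mathbf{c} + \mathbf{h})}{\|\mathbf{h}\|}\right), \tag{4.2.5}
\]

and so

\[
\lim_{\mathbf{h} \to \mathbf{0}} \frac{\|R(\mathbf{h})\|}{\|\mathbf{h}\|} = 0,
\]

that is, $A$ is the best affine approximation to $f$ at $\mathbf{c}$, if and only if

\[
\lim_{\mathbf{h} \to \mathbf{0}} \frac{f_k(\mathbf{c} + \mathbf{h}) - A_k(\mathbf{c} + \mathbf{h})}{\|\mathbf{h}\|} = 0 \tag{4.2.6}
\]

for $k = 1, 2, \ldots, n$. But (4.2.6) is the statement that $A_k$ is the best affine approximation to $f_k$ at $\mathbf{c}$. In other words, $A$ is the best affine approximation to $f$ at $\mathbf{c}$ if and only if $A_k$ is the best affine approximation to $f_k$ at $\mathbf{c}$ for $k = 1, 2, \ldots, n$. This result has many interesting consequences.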

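Proposition If $f_k : \mathbb{R}^m \to \mathbb{R}$ is the $k$th coordinate function of $f : \mathbb{R}^m \to \mathbb{R}^n$, then $f$ is differentiable at a point $\mathbf{c}$ if and only if $f_k$ is differentiable at $\mathbf{c}$ for $k = 1, 2, \ldots, n$.

Definition If $f_k : \mathbb{R}^m \to \mathbb{R}$ is the $k$th coordinate function of $f : \mathbb{R}^m \to \mathbb{R}^n$, then we say $f$ is $C^1$ on an open set $U$ if $f_k$ is $C^1$ on $U$ for $k = 1, 2, \ldots, n$.

Putting our results in Section 3.3 together with the previous proposition and definition, we have the following basic result.

Theorem If $f : \mathbb{R}^m \to \mathbb{R}^n$ is $C^1$ on an open ball containing the point $\mathbf{c}$, then $f$ is differentiable at $\mathbf{c}$.

Suppose $f : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable at $\mathbf{c} = (c_1, c_2, \ldots, c_m)$ with best affine approximation $A$ and $f_k : \mathbb{R}^m \to \mathbb{R}$ and $A_k : \mathbb{R}^m \to \mathbb{R}$ are the coordinate functions of $f$ and $A$, respectively, for $k = 1, 2, \ldots, n$. Since $A_k$ is the best affine approximation to $f_k$ at $\mathbf{c}$, we know from Section 3.3 that

\[
A_k(\mathbf{x}) = \nabla f_k(\mathbf{c}) \cdot (\mathbf{x} - \mathbf{c}) + f_k(\mathbf{c}) \tag{4.2.7}
\]

for all $\mathbf{x}$ in $\mathbb{R}^m$. Hence, writing the vectors as column vectors, we have

\[
\begin{aligned}
A(\mathbf{x}) &=
\begin{bmatrix} A_1(\mathbf{x}) \\ A_2(\mathbf{x}) \\ \vdots \\ A_n(\mathbf{x}) \end{bmatrix}
=
\begin{bmatrix}
\nabla f_1(\mathbf{c}) \cdot (\mathbf{x} - \mathbf{c}) + f_1(\mathbf{c}) \\
\nabla f_2(\mathbf{c}) \cdot (\mathbf{x} - \mathbf{c}) + f_2(\mathbf{c}) \\
\vdots \\
\nabla f_n(\mathbf{c}) \cdot (\mathbf{x} - \mathbf{c}) + f_n(\mathbf{c})
\end{bmatrix} \\
&=
\begin{bmatrix}
\frac{\partial}{\partial x_1} f_1(\mathbf{c}) & \frac{\partial}{\partial x_2} f_1(\mathbf{c}) & \cdots & \frac{\partial}{\partial x_m} f_1(\mathbf{c}) \\
\frac{\partial}{\partial x_1} f_2(\mathbf{c}) & \frac{\partial}{\partial x_2} f_2(\mathbf{c}) & \cdots & \frac{\partial}{\partial x_m} f_2(\mathbf{c}) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial}{\partial x_1} f_n(\mathbf{c}) & \frac{\partial}{\partial x_2} f_n(\mathbf{c}) & \cdots & \frac{\partial}{\partial x_m} f_n(\mathbf{c})
\end{bmatrix}
\begin{bmatrix} x_1 - c_1 \\ x_2 - c_2 \\ \vdots \\ x_m - c_m \end{bmatrix}
+
\begin{bmatrix} f_1(\mathbf{c}) \\ f_2(\mathbf{c}) \\ \vdots \\ f_n(\mathbf{c}) \end{bmatrix}.
\end{aligned} \tag{4.2.8}
\]

It follows that the $n \times m$ matrix in (4.2.8) is the derivative of $f$.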

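Theorem If $f : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable at a point $\mathbf{c}$, then the derivative of $f$ at $\mathbf{c}$ is given by

\[
Df(\mathbf{c}) =
\begin{bmatrix}
\frac{\partial}{\partial x_1} f_1(\mathbf{c}) & \frac{\partial}{\partial x_2} f_1(\mathbf{c}) & \cdots & \frac{\partial}{\partial x_m} f_1(\mathbf{c}) \\
\frac{\partial}{\partial x_1} f_2(\mathbf{c}) & \frac{\partial}{\partial x_2} f_2(\mathbf{c}) & \cdots & \frac{\partial}{\partial x_m} f_2(\mathbf{c}) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial}{\partial x_1} f_n(\mathbf{c}) & \frac{\partial}{\partial x_2} f_n(\mathbf{c}) & \cdots & \frac{\partial}{\partial x_m} f_n(\mathbf{c})
\end{bmatrix}. \tag{4.2.9}
\]

We call the matrix in (4.2.9) the Jacobian matrix of $f$, after the German mathematician Carl Gustav Jacob Jacobi (1804-1851). Note that we have seen this matrix before in our discussion of change of variables in integrals in Section 3.7.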

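Example Consider the function $f : \mathbb{R}^3 \to \mathbb{R}^2$ defined by

\[
f(x,y,z) = (xyz, 3x - 2yz).
\]

The coordinate functions of $f$ are

\[
f_1(x,y,z) = xyz
\]

and

\[
f_2(x,y,z) = 3x - 2yz.
\]

Now

\[
\nabla f_1(x,y,z) = (yz, xz, xy)
\]

and

\[
\nabla f_2(x,y,z) = (3, -2z, -2y),
\]

so the Jacobian of $f$ is

\[
Df(x,y,z) = \begin{bmatrix} yz & xz & xy \\ 3 & -2z & -2y \end{bmatrix}.
\]

Hence, for example,

\[
Df(1,2,-1) = \begin{bmatrix} -2 & -1 & 2 \\ 3 & 2 & -4 \end{bmatrix}.
\]

Since $f(1,2,-1) = (-2, 7)$, the best affine approximation to $f$ at $(1, 2, -1)$ is

\[
\begin{aligned}
A(x,y,z) &= \begin{bmatrix} -2 & -1 & 2 \\ 3 & 2 & -4 \end{bmatrix}
\begin{bmatrix} x - 1 \\ y - 2 \\ z + 1 \end{bmatrix}
+ \begin{bmatrix} -2 \\ 7 \end{bmatrix} \\
&= \begin{bmatrix} -2(x - 1) - (y - 2) + 2(z + 1) - 2 \\ 3(x - 1) + 2(y - 2) - 4(z + 1) + 7 \end{bmatrix} \\
&= \begin{bmatrix} -2x - y + 2z + 4 \\ 3x + 2y - 4z - 4 \end{bmatrix}.
\end{aligned}
\]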


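Tangent planes

Suppose $f : \mathbb{R}^2 \to \mathbb{R}^3$ parametrizes a surface $S$ in $\mathbb{R}^3$. If $f_1$, $f_2$, and $f_3$ are the coordinate functions of $f$, then the best affine approximation to $f$ at a point $(s_0, t_0)$ is given by

\[
\begin{aligned}
A(s,t) &=
\begin{bmatrix}
\frac{\partial}{\partial s} f_1(s_0,t_0) & \frac{\partial}{\partial t} f_1(s_0,t_0) \\
\frac{\partial}{\partial s} f_2(s_0,t_0) & \frac{\partial}{\partial t} f_2(s_0,t_0) \\
\frac{\partial}{\partial s} f_3(s_0,t_0) & \frac{\partial}{\partial t} f_3(s_0,t_0)
\end{bmatrix}
\begin{bmatrix} s - s_0 \\ t - t_0 \end{bmatrix}
+
\begin{bmatrix} f_1(s_0,t_0) \\ f_2(s_0,t_0) \\ f_3(s_0,t_0) \end{bmatrix} \\
&=
\begin{bmatrix}
\frac{\partial}{\partial s} f_1(s_0,t_0) \\
\frac{\partial}{\partial s} f_2(s_0,t_0) \\
\frac{\partial}{\partial s} f_3(s_0,t_0)
\end{bmatrix}(s - s_0)
+
\begin{bmatrix}
\frac{\partial}{\partial t} f_1(s_0,t_0) \\
\frac{\partial}{\partial t} f_2(s_0,t_0) \\
\frac{\partial}{\partial t} f_3(s_0,t_0)
\end{bmatrix}(t - t_0)
+
\begin{bmatrix} f_1(s_0,t_0) \\ f_2(s_0,t_0) \\ f_3(s_0,t_0) \end{bmatrix}.
\end{aligned} \tag{4.2.10}
\]

If the vectors

\[
\mathbf{v} =
\begin{bmatrix}
\frac{\partial}{\partial s} f_1(s_0,t_0) \\
\frac{\partial}{\partial s} f_2(s_0,t_0) \\
\frac{\partial}{\partial s} f_3(s_0,t_0)
\end{bmatrix} \tag{4.2.11}
\]

and

\[
\mathbf{w} =
\begin{bmatrix}
\frac{\partial}{\partial t} f_1(s_0,t_0) \\
\frac{\partial}{\partial t} f_2(s_0,t_0) \\
\frac{\partial}{\partial t} f_3(s_0,t_0)
\end{bmatrix} \tag{4.2.12}
\]

are linearly independent, then (4.2.10) implies that the image of $A$ is a plane in $\mathbb{R}^3$ which passes through the point $f(s_0,t_0)$ on the surface $S$. Moreover, if we let $C_1$ be the curve on $S$ through the point $f(s_0,t_0)$ parametrized by $\varphi_1(s) = f(s,t_0)$ and $C_2$ be the curve on $S$ through the point $f(s_0,t_0)$ parametrized by $\varphi_2(t) = f(s_0,t)$, then $\mathbf{v}$ is tangent to $C_1$ at $f(s_0,t_0)$ and $\mathbf{w}$ is tangent to $C_2$ at $f(s_0,t_0)$. Hence we call the image of $A$ the tangent plane to the surface $S$ at the point $f(s_0,t_0)$.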

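Example Let $T$ be the torus parametrized by

\[
f(s,t) = ((3 + \cos(t))\cos(s), (3 + \cos(t))\sin(s), \sin(t))
\]

for $0 \leq s \leq 2\pi$ and $0 \leq t \leq 2\pi$. Then

\[
Df(s,t) =
\begin{bmatrix}
-(3 + \cos(t))\sin(s) & -\sin(t)\cos(s) \\
(3 + \cos(t))\cos(s) & -\sin(t)\sin(s) \\
0 & \cos(t)
\end{bmatrix}.
\]

Thus, for example,

\[
Df\left(\frac{\pi}{2}, \frac{\pi}{4}\right) =
\begin{bmatrix}
-\left(3 + \frac{1}{\sqrt{2}}\right) & 0 \\
0 & -\frac{1}{\sqrt{2}} \\
0 & \frac{1}{\sqrt{2}}
\end{bmatrix}.
\]

Since

\[
f\left(\frac{\pi}{2}, \frac{\pi}{4}\right) = \left(0,\ 3 + \frac{1}{\sqrt{2}},\ \frac{1}{\sqrt{2}}\right),
\]

the best affine approximation to $f$ at $\left(\frac{\pi}{2}, \frac{\pi}{4}\right)$ is

\[
\begin{aligned}
A(s,t) &=
\begin{bmatrix}
-\left(3 + \frac{1}{\sqrt{2}}\right) & 0 \\
0 & -\frac{1}{\sqrt{2}} \\
0 & \frac{1}{\sqrt{2}}
\end{bmatrix}
\begin{bmatrix} s - \frac{\pi}{2} \\ t - \frac{\pi}{4} \end{bmatrix}
+
\begin{bmatrix} 0 \\ 3 + \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} \\
&=
\begin{bmatrix} -\left(3 + \frac{1}{\sqrt{2}}\right) \\ 0 \\ 0 \end{bmatrix}\left(s - \frac{\pi}{2}\right)
+
\begin{bmatrix} 0 \\ -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}\left(t - \frac{\pi}{4}\right)
+
\begin{bmatrix} 0 \\ 3 + \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix}.
\end{aligned}
\]

Hence

\[
\begin{aligned}
x &= -\left(3 + \frac{1}{\sqrt{2}}\right)\left(s - \frac{\pi}{2}\right), \\
y &= -\frac{1}{\sqrt{2}}\left(t - \frac{\pi}{4}\right) + 3 + \frac{1}{\sqrt{2}}, \\
z &= \frac{1}{\sqrt{2}}\left(t - \frac{\pi}{4}\right) + \frac{1}{\sqrt{2}},
\end{aligned}
\]

are parametric equations for the plane $P$ tangent to $T$ at $\left(0,\ 3 + \frac{1}{\sqrt{2}},\ \frac{1}{\sqrt{2}}\right)$. See Figure 4.2.1.

Figure 4.2.1 Torus with a tangent plane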
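One way to see that these parametric equations do describe the tangent plane is to compare the torus with the plane near the point of tangency: the gap between them should shrink faster than the step taken away from $\left(\frac{\pi}{2}, \frac{\pi}{4}\right)$. The sketch below does this numerically; the function names and the step sizes are illustrative assumptions.

```python
import math

def torus(s, t):
    """f(s, t) = ((3 + cos t) cos s, (3 + cos t) sin s, sin t)."""
    r = 3 + math.cos(t)
    return (r * math.cos(s), r * math.sin(s), math.sin(t))

def tangent_plane(s, t, s0=math.pi / 2, t0=math.pi / 4):
    """Parametric equations of the tangent plane found in the example."""
    k = 1 / math.sqrt(2)
    return (-(3 + k) * (s - s0),
            -k * (t - t0) + 3 + k,
            k * (t - t0) + k)

s0, t0 = math.pi / 2, math.pi / 4
for step in (0.1, 0.01, 0.001):
    p = torus(s0 + step, t0 + step)
    q = tangent_plane(s0 + step, t0 + step)
    gap = math.dist(p, q)
    # gap / step should shrink with step, reflecting the o(h) remainder.
    print(step, gap, gap / step)
```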

Chain rule

We are now in a position to state the chain rule in its most general form. Consider functions $g : \mathbb{R}^m \to \mathbb{R}^q$ and $f : \mathbb{R}^q \to \mathbb{R}^n$ and suppose $g$ is differentiable at $\mathbf{c}$ and $f$ is differentiable at $g(\mathbf{c})$. Let $h : \mathbb{R}^m \to \mathbb{R}^n$ be the composition $h(\mathbf{x}) = f(g(\mathbf{x}))$ and denote the coordinate functions of $f$, $g$, and $h$ by $f_i$, $i = 1, 2, \ldots, n$, $g_j$, $j = 1, 2, \ldots, q$, and $h_k$, $k = 1, 2, \ldots, n$, respectively. Then, for $k = 1, 2, \ldots, n$,

\[
h_k(x_1, x_2, \ldots, x_m) = f_k\bigl(g_1(x_1, x_2, \ldots, x_m), g_2(x_1, x_2, \ldots, x_m), \ldots, g_q(x_1, x_2, \ldots, x_m)\bigr).
\]

Now if we fix $m - 1$ of the variables $x_1, x_2, \ldots, x_m$, say, all but $x_j$, then $h_k$ is the composition of a function from $\mathbb{R}$ to $\mathbb{R}^q$ with a function from $\mathbb{R}^q$ to $\mathbb{R}$. Thus we may use the chain rule from Section 3.3 to compute $\frac{\partial}{\partial x_j} h_k(\mathbf{c})$, namely,

\[
\frac{\partial}{\partial x_j} h_k(\mathbf{c}) = \nabla f_k(g(\mathbf{c})) \cdot \left(\frac{\partial}{\partial x_j} g_1(\mathbf{c}), \frac{\partial}{\partial x_j} g_2(\mathbf{c}), \ldots, \frac{\partial}{\partial x_j} g_q(\mathbf{c})\right). \tag{4.2.13}
\]

Hence $\frac{\partial}{\partial x_j} h_k(\mathbf{c})$ is equal to the dot product of the $k$th row of $Df(g(\mathbf{c}))$ with the $j$th column of $Dg(\mathbf{c})$. Moreover, if $g$ is $C^1$ on an open ball about $\mathbf{c}$ and $f$ is $C^1$ on an open ball about $g(\mathbf{c})$, then (4.2.13) shows that $\frac{\partial}{\partial x_j} h_k$ is continuous on an open ball about $\mathbf{c}$. It follows from our results in Section 3.3 that $h$ is differentiable at $\mathbf{c}$. Since $\frac{\partial}{\partial x_j} h_k(\mathbf{c})$ is the entry in the $k$th row and $j$th column of $Dh(\mathbf{c})$, (4.2.13) implies $Dh(\mathbf{c}) = Df(g(\mathbf{c}))Dg(\mathbf{c})$. This result, the chain rule, may be proven without assuming that $f$ and $g$ are both $C^1$, and so we state the more general result in the following theorem.

Chain Rule If $g : \mathbb{R}^m \to \mathbb{R}^q$ is differentiable at $\mathbf{c}$ and $f : \mathbb{R}^q \to \mathbb{R}^n$ is differentiable at $g(\mathbf{c})$, then $f \circ g$ is differentiable at $\mathbf{c}$ and

\[
D(f \circ g)(\mathbf{c}) = Df(g(\mathbf{c}))Dg(\mathbf{c}). \tag{4.2.14}
\]

Equivalently, the chain rule says that if $A$ is the best affine approximation to $g$ at $\mathbf{c}$ and $B$ is the best affine approximation to $f$ at $g(\mathbf{c})$, then $B \circ A$ is the best affine approximation to $f \circ g$ at $\mathbf{c}$. That is, the best affine approximation to a composition of functions is the composition of the individual best affine approximations.



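Example Suppose $g : \mathbb{R}^2 \to \mathbb{R}^3$ is defined by

\[
g(s,t) = (\cos(s)\sin(t), \sin(s)\sin(t), \cos(t))
\]

and $f : \mathbb{R}^3 \to \mathbb{R}^2$ is defined by

\[
f(x,y,z) = (10xyz, x^2 - yz).
\]

Then

\[
Dg(s,t) =
\begin{bmatrix}
-\sin(s)\sin(t) & \cos(s)\cos(t) \\
\cos(s)\sin(t) & \sin(s)\cos(t) \\
0 & -\sin(t)
\end{bmatrix}
\]

and

\[
Df(x,y,z) =
\begin{bmatrix}
10yz & 10xz & 10xy \\
2x & -z & -y
\end{bmatrix}.
\]

Let $h(s,t) = f(g(s,t))$. To find $Dh\left(\frac{\pi}{4}, \frac{\pi}{4}\right)$, we first note that

\[
g\left(\frac{\pi}{4}, \frac{\pi}{4}\right) = \left(\frac{1}{2}, \frac{1}{2}, \frac{1}{\sqrt{2}}\right),
\]

\[
Dg\left(\frac{\pi}{4}, \frac{\pi}{4}\right) =
\begin{bmatrix}
-\frac{1}{2} & \frac{1}{2} \\
\frac{1}{2} & \frac{1}{2} \\
0 & -\frac{1}{\sqrt{2}}
\end{bmatrix},
\]

and

\[
Df\left(g\left(\frac{\pi}{4}, \frac{\pi}{4}\right)\right) = Df\left(\frac{1}{2}, \frac{1}{2}, \frac{1}{\sqrt{2}}\right) =
\begin{bmatrix}
\frac{5}{\sqrt{2}} & \frac{5}{\sqrt{2}} & \frac{5}{2} \\
1 & -\frac{1}{\sqrt{2}} & -\frac{1}{2}
\end{bmatrix}.
\]

Thus

\[
\begin{aligned}
Dh\left(\frac{\pi}{4}, \frac{\pi}{4}\right)
&= Df\left(g\left(\frac{\pi}{4}, \frac{\pi}{4}\right)\right) Dg\left(\frac{\pi}{4}, \frac{\pi}{4}\right) \\
&=
\begin{bmatrix}
\frac{5}{\sqrt{2}} & \frac{5}{\sqrt{2}} & \frac{5}{2} \\
1 & -\frac{1}{\sqrt{2}} & -\frac{1}{2}
\end{bmatrix}
\begin{bmatrix}
-\frac{1}{2} & \frac{1}{2} \\
\frac{1}{2} & \frac{1}{2} \\
0 & -\frac{1}{\sqrt{2}}
\end{bmatrix} \\
&=
\begin{bmatrix}
0 & \frac{5}{2\sqrt{2}} \\
-\frac{1 + \sqrt{2}}{2\sqrt{2}} & \frac{1}{2}
\end{bmatrix}.
\end{aligned}
\]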
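The matrix product in this example can be confirmed numerically. The sketch below multiplies the two Jacobians as the chain rule prescribes and, independently, differentiates $h = f \circ g$ by central differences; the helper names `matmul`, `Dh`, and `Dh_numeric` and the step `eps` are illustrative assumptions.

```python
import math

def g(s, t):
    return (math.cos(s) * math.sin(t), math.sin(s) * math.sin(t), math.cos(t))

def f(x, y, z):
    return (10 * x * y * z, x ** 2 - y * z)

def Dg(s, t):
    return [[-math.sin(s) * math.sin(t), math.cos(s) * math.cos(t)],
            [math.cos(s) * math.sin(t), math.sin(s) * math.cos(t)],
            [0.0, -math.sin(t)]]

def Df(x, y, z):
    return [[10 * y * z, 10 * x * z, 10 * x * y],
            [2 * x, -z, -y]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def Dh(s, t):
    """Chain rule: Dh(s, t) = Df(g(s, t)) Dg(s, t)."""
    return matmul(Df(*g(s, t)), Dg(s, t))

def Dh_numeric(s, t, eps=1e-6):
    """Central differences applied directly to h = f o g, for comparison."""
    h = lambda u, v: f(*g(u, v))
    ds = [(a - b) / (2 * eps) for a, b in zip(h(s + eps, t), h(s - eps, t))]
    dt = [(a - b) / (2 * eps) for a, b in zip(h(s, t + eps), h(s, t - eps))]
    return [[ds[0], dt[0]], [ds[1], dt[1]]]

print(Dh(math.pi / 4, math.pi / 4))
print(Dh_numeric(math.pi / 4, math.pi / 4))
```

Both printed matrices should be close to the matrix obtained by hand in the example.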



Problems

1. Find the best affine approximation for each of the following functions at the specified point $\mathbf{c}$.

(a) $f(x,y) = (x^2 + y^2, 3xy)$, $\mathbf{c} = (1,2)$

(b) g(x,y,z) = (sin(x + y + z), xy cos(z)), c = (0, f , f) 

(c) $h(s,t) = (3s^2 + t, s - t, 4st^2, 4t - s)$, $\mathbf{c} = (-1, 3)$

2. Each of the following functions parametrizes a surface $S$ in $\mathbb{R}^3$. In each case, find parametric equations for the tangent plane $P$ passing through the point $f(s_0,t_0)$. Plot $S$ and $P$ together.

(a) f(s,t) = (tcos(s),tsm(s),t), (s ,t ) = (f ,2) 

(b) $f(s,t) = (t^2\cos(s), t^2, t^2\sin(s))$, $(s_0, t_0) = (0,1)$

(c) f(s,t) = (cos(s)sin(t),sin(s)sin(t),cos(t)), (s ,*o) = (f , f ) 

(d) f(s,t) = (3cos(s)sin(t),sin(s)sin(t),2cos(t)), (s ,t ) = (f,f) 

(e) /(s,t) = ((4 + 2cos(t))cos(s),(4 + 2cos(t))sin( S ),2sin(t)), (a ,to) = (x> f ) 

3. Let $S$ be the graph of a function $f : \mathbb{R}^2 \to \mathbb{R}$. Define the function $F : \mathbb{R}^2 \to \mathbb{R}^3$ by $F(s,t) = (s, t, f(s,t))$. We may find an equation for the plane tangent to $S$ at $(s_0, t_0, f(s_0, t_0))$ using either the techniques of Section 3.3 (looking at $S$ as the graph of $f$) or the techniques of this section (looking at $S$ as a surface parametrized by $F$). Verify that these two approaches yield equations for the same plane, both in the special case when $f(s,t) = s^2 + t^2$ and $(s_0, t_0) = (1,2)$, and in the general case.

4. Use the chain rule to find the derivative of $f \circ g$ at the point $\mathbf{c}$ for each of the following.

(a) $f(x,y) = (x^2 y, x - y)$, $g(s,t) = (3st, s^2 - 4t)$, $\mathbf{c} = (1, -2)$

(b) f(x, y, z) = (4xy, 3xz), g(s, t) = (st 2 - At, s 2 , , c = (-2, 3) 

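(c) $f(x,y) = (3x + 4y, 2x^2 y, x - y)$, $g(s,t,u) = (4s - 3t + u, 5st^2)$, $\mathbf{c} = (1, -2, 3)$

5. Suppose

\[
x = f(u,v), \qquad y = g(u,v),
\]

and

\[
u = h(s,t), \qquad v = k(s,t).
\]

(a) Show that

\[
\frac{\partial x}{\partial s} = \frac{\partial x}{\partial u}\frac{\partial u}{\partial s} + \frac{\partial x}{\partial v}\frac{\partial v}{\partial s}
\]

and

\[
\frac{\partial x}{\partial t} = \frac{\partial x}{\partial u}\frac{\partial u}{\partial t} + \frac{\partial x}{\partial v}\frac{\partial v}{\partial t}.
\]

(b) Find similar expressions for $\frac{\partial y}{\partial s}$ and $\frac{\partial y}{\partial t}$.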

6. Use your results in Problem 5 to find ||, ||, and || when 

\[
x = u^2 v, \qquad y = 3u - v,
\]

and

\[
u = 4t^2 - s^2, \qquad v = \frac{4t}{s}.
\]

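7. Suppose $T$ is a function of $x$ and $y$ where

\[
x = r\cos(\theta), \qquad y = r\sin(\theta).
\]

Show that

\[
\frac{\partial T}{\partial r} = \frac{\partial T}{\partial x}\cos(\theta) + \frac{\partial T}{\partial y}\sin(\theta)
\]

and

\[
\frac{\partial T}{\partial \theta} = -\frac{\partial T}{\partial x}\,r\sin(\theta) + \frac{\partial T}{\partial y}\,r\cos(\theta).
\]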

8. Suppose the temperature at a point (x, y) in the plane is given by 

r=ioo 20 



y/l + x 2 + y 2 ' 

(a) If (r, 9) represents the polar coordinates of (x,y), use Problem 7 to find §^ and 
^ when r = 4 and 9 = | . 

(b) Show that $\frac{\partial T}{\partial \theta} = 0$ for all values of $r$ and $\theta$. Can you explain this result geometrically?

9. Let $T$ be the torus parametrized by

\[
\begin{aligned}
x &= (4 + 2\cos(t))\cos(s), \\
y &= (4 + 2\cos(t))\sin(s), \\
z &= 2\sin(t),
\end{aligned}
\]

for $0 \leq s \leq 2\pi$ and $0 \leq t \leq 2\pi$.

(a) If $U$ is a function of $x$, $y$, and $z$, find general expressions for $\frac{\partial U}{\partial s}$ and $\frac{\partial U}{\partial t}$.

(b) Suppose

U = 80- 40e-^ 2 +f 2 + 22 ) 

gives the temperature at a point $(x, y, z)$ on $T$. Find expressions for $\frac{\partial U}{\partial s}$ and $\frac{\partial U}{\partial t}$ in this case. What is the geometrical interpretation of these quantities?

(c) Evaluate §^ and ^ in the particular case s = f and £ = f ■ 



The Calculus of Functions 
of 

Several Variables 



Section 4.3 
Line Integrals 



We will motivate the mathematical concept of a line integral through an initial discussion of the physical concept of work.

Work

If a force of constant magnitude $F$ is acting in the direction of motion of an object along a line, and the object moves a distance $d$ along this line, then we call the quantity $Fd$ the work done by the force on the object. More generally, if the vector $\mathbf{F}$ represents a constant force acting on an object as it moves along a displacement vector $\mathbf{d}$, then

\[
\mathbf{F} \cdot \frac{\mathbf{d}}{\|\mathbf{d}\|} \tag{4.3.1}
\]

is the magnitude of $\mathbf{F}$ in the direction of motion (see Figure 4.3.1) and we define

\[
\left(\mathbf{F} \cdot \frac{\mathbf{d}}{\|\mathbf{d}\|}\right)\|\mathbf{d}\| = \mathbf{F} \cdot \mathbf{d} \tag{4.3.2}
\]

to be the work done by $\mathbf{F}$ on the object when it is displaced by $\mathbf{d}$.

Figure 4.3.1 Magnitude of $\mathbf{F}$ in the direction of $\mathbf{d}$ is $\mathbf{F} \cdot \mathbf{u}$, where $\mathbf{u} = \frac{\mathbf{d}}{\|\mathbf{d}\|}$


We now generalize the formulation of work in (4.3.2) to the situation where an object 
P moves along some curve C subject to a force which depends continuously on position 
(but does not depend on time). Specifically, we represent the force by a continuous vector 
field, say, F : R n — > M n , and we suppose P moves along a curve C which has a smooth 



1 
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parametrization tp : / — > M n , where I = [a,b]. See Figure 4.3.2. To approximate the 
work done by F as P moves from ip(a) to <p(b) along C, we first divide I into m equal 
subintervals of length 

b — a 

At = 

m 

with endpoints to = a < t\ < t<i < ■ ■ ■ < t m = b. Now at time tk, k = 0, 1, . . . , m — 1, P 
is moving in the direction of D(p(tk) at a speed of ||.Dy?(ifc)||, and so will move a distance 
of approximately At over the time interval [tfc, ifc+i] • Thus we may approximate 

the work done by F as P moves from (f(tk) to ip(tk+i) by the work done by the force 
F{ip{tk)) in moving P along the displacement vector Dipft^At, which is a vector of length 
\\Dip(tk) || At in the direction of Dip(tk). That is, if we let Wk denote the work done by F 
as P moves from tp{tk) to (p(tk-i), then 

W k « F((p(t k )) ■ D<p(t k )At. (4.3.3) 

If we let $W$ denote the total work done by $F$ as $P$ moves along $C$, then we have

$$W = \sum_{k=0}^{m-1} W_k \approx \sum_{k=0}^{m-1} F(\varphi(t_k)) \cdot D\varphi(t_k)\Delta t. \tag{4.3.4}$$

As $m$ increases, we should expect the approximation in (4.3.4) to approach $W$. Moreover,
since $F(\varphi(t)) \cdot D\varphi(t)$ is a continuous function of $t$ and the sum in (4.3.4) is a left-hand rule
approximation for the definite integral of $F(\varphi(t)) \cdot D\varphi(t)$ over the interval $[a,b]$, we should
have

$$W = \lim_{m \to \infty} \sum_{k=0}^{m-1} F(\varphi(t_k)) \cdot D\varphi(t_k)\Delta t = \int_a^b F(\varphi(t)) \cdot D\varphi(t)\,dt. \tag{4.3.5}$$
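The left-hand rule sum in (4.3.4) can be computed directly. The following Python sketch is
illustrative only: the function name `work_left_sum` and the sample force $F(x,y) = (y,x)$
along $\varphi(t) = (t,t^2)$, $-1 \le t \le 1$, are assumptions chosen for demonstration and are not
part of the derivation.

```python
def work_left_sum(F, phi, dphi, a, b, m):
    """Left-hand rule sum from (4.3.4): sum of F(phi(t_k)) . Dphi(t_k) * dt."""
    dt = (b - a) / m
    total = 0.0
    for k in range(m):
        t_k = a + k * dt                      # left endpoint of the k-th subinterval
        force = F(*phi(t_k))                  # force at the point phi(t_k)
        velocity = dphi(t_k)                  # derivative Dphi(t_k)
        total += sum(f * v for f, v in zip(force, velocity)) * dt
    return total

# Sample force field and curve (assumed for illustration).
F = lambda x, y: (y, x)
phi = lambda t: (t, t * t)
dphi = lambda t: (1.0, 2.0 * t)

approximations = {m: work_left_sum(F, phi, dphi, -1.0, 1.0, m) for m in (10, 100, 1000)}
for m, w in approximations.items():
    print(m, w)
```

As $m$ grows, the printed values should settle toward the value of the integral in (4.3.5).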



Section 4.3 Line Integrals

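Example Suppose an object moves along the curve $C$ parametrized by $\varphi(t) = (t,t^2)$,
$-1 \le t \le 1$, subject to the force $F(x,y) = (y,x)$. Then the work done by $F$ as the object
moves from $\varphi(-1) = (-1,1)$ to $\varphi(1) = (1,1)$ is

$$\begin{aligned}
W &= \int_{-1}^{1} F(\varphi(t)) \cdot D\varphi(t)\,dt \\
&= \int_{-1}^{1} F(t,t^2) \cdot (1,2t)\,dt \\
&= \int_{-1}^{1} (t^2,t) \cdot (1,2t)\,dt \\
&= \int_{-1}^{1} 3t^2\,dt \\
&= t^3\Big|_{-1}^{1} = 2.
\end{aligned}$$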


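Example The function $\psi(t) = \left(\frac{t}{2}, \frac{t^2}{4}\right)$, $-2 \le t \le 2$, is also a smooth parametrization of
the curve $C$ in the previous example. Using the same force function $F$, we have

$$\begin{aligned}
\int_{-2}^{2} F(\psi(t)) \cdot D\psi(t)\,dt &= \int_{-2}^{2} \left(\frac{t^2}{4}, \frac{t}{2}\right) \cdot \left(\frac{1}{2}, \frac{t}{2}\right) dt \\
&= \int_{-2}^{2} \frac{3t^2}{8}\,dt \\
&= \frac{t^3}{8}\Big|_{-2}^{2} = 2.
\end{aligned}$$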



This is the result we should expect: as long as the curve is traversed only once, the work 
done by a force when an object moves along the curve should depend only on the curve 
and not on any particular parametrization of the curve. 
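This independence of parametrization can also be checked numerically. The sketch below,
assuming Python and a simple midpoint rule (the helper name `line_integral` is hypothetical),
approximates the work integral for both parametrizations of the parabola used above.

```python
def line_integral(F, curve, velocity, a, b, m=2000):
    """Midpoint-rule approximation of the integral of F(curve(t)) . velocity(t) over [a, b]."""
    dt = (b - a) / m
    total = 0.0
    for k in range(m):
        t = a + (k + 0.5) * dt                # midpoint of the k-th subinterval
        total += sum(f * v for f, v in zip(F(*curve(t)), velocity(t))) * dt
    return total

F = lambda x, y: (y, x)

# phi(t) = (t, t^2) on [-1, 1] and psi(t) = (t/2, t^2/4) on [-2, 2] trace the same curve.
w_phi = line_integral(F, lambda t: (t, t ** 2), lambda t: (1.0, 2.0 * t), -1.0, 1.0)
w_psi = line_integral(F, lambda t: (t / 2, t ** 2 / 4), lambda t: (0.5, t / 2), -2.0, 2.0)
print(w_phi, w_psi)
```

Both values should agree with each other, up to the error of the quadrature rule.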

We need to verify the previous statement in general before we can state our definition
of the line integral. Note that in these two examples, $\psi(t) = \varphi\left(\frac{t}{2}\right)$. In other words,
$\psi(t) = \varphi(g(t))$, where $g(t) = \frac{t}{2}$ for $-2 \le t \le 2$. In general, if $\varphi(t)$, for $t$ in an interval
$I = [a,b]$, and $\psi(t)$, for $t$ in an interval $J = [c,d]$, are both smooth parametrizations of a curve
$C$ such that every point on $C$ corresponds to exactly one point in $I$ and exactly one point in $J$,
then there exists a differentiable function $g$ which maps $J$ onto $I$ such that $\psi(t) = \varphi(g(t))$.
Defining such a $g$ is straightforward: given any $t$ in $[c,d]$, find the unique value $s$ in $[a,b]$
such that $\varphi(s) = \psi(t)$ (such a value $s$ has to exist since $C$ is the image of both $\psi$ and $\varphi$).
Then $g(t) = s$. Proving that $g$ is differentiable is not as easy, and we will not provide a
proof here. However, assuming that $g$ is differentiable, it follows that for any continuous
vector field $F$,

$$\begin{aligned}
\int_c^d F(\psi(t)) \cdot D\psi(t)\,dt &= \int_c^d F(\varphi(g(t))) \cdot D(\varphi \circ g)(t)\,dt \\
&= \int_c^d F(\varphi(g(t))) \cdot D\varphi(g(t))\,g'(t)\,dt.
\end{aligned} \tag{4.3.6}$$

Now if we let $u = g(t)$, so that $du = g'(t)\,dt$, then

$$\int_c^d F(\psi(t)) \cdot D\psi(t)\,dt = \int_a^b F(\varphi(u)) \cdot D\varphi(u)\,du \tag{4.3.7}$$

if $g(c) = a$ and $g(d) = b$ (that is, if $\varphi(a) = \psi(c)$ and $\varphi(b) = \psi(d)$), and

$$\int_c^d F(\psi(t)) \cdot D\psi(t)\,dt = \int_b^a F(\varphi(u)) \cdot D\varphi(u)\,du = -\int_a^b F(\varphi(u)) \cdot D\varphi(u)\,du \tag{4.3.8}$$

if $g(c) = b$ and $g(d) = a$ (that is, $\varphi(a) = \psi(d)$ and $\varphi(b) = \psi(c)$). Note that the second case
occurs only if $\psi$ parametrizes $C$ in the reverse direction of $\varphi$, in which case we say $\psi$ is an
orientation reversing reparametrization of $\varphi$. In the first case, that is, when $\varphi(a) = \psi(c)$
and $\varphi(b) = \psi(d)$, we say $\psi$ is an orientation preserving reparametrization of $\varphi$. Our results
in (4.3.7) and (4.3.8) then correspond to the physical notion that the work done by a force
in moving an object along a curve is the negative of the work done by the force in moving
the object along the curve in the opposite direction. From now on, when referring to a
curve $C$, we will assume some orientation, or direction, has been specified. We will then
use $-C$ to refer to the curve consisting of the same set of points as $C$, but with the reverse
orientation.

Line integrals 

Now that we know that, except for direction, the value of the integral involved in computing 
work does not depend on the particular parametrization of the curve, we may state a formal 
mathematical definition. 

Definition Suppose $C$ is a curve in $\mathbb{R}^n$ with smooth parametrization $\varphi : I \to \mathbb{R}^n$, where
$I = [a,b]$ is an interval in $\mathbb{R}$. If $F : \mathbb{R}^n \to \mathbb{R}^n$ is a continuous vector field, then we define
the line integral of $F$ along $C$, denoted

$$\int_C F \cdot ds,$$

by

$$\int_C F \cdot ds = \int_a^b F(\varphi(t)) \cdot D\varphi(t)\,dt. \tag{4.3.9}$$

As a consequence of our previous remarks, we have the following result.

Proposition Using the notation of the definition,

$$\int_C F \cdot ds$$

depends only on the curve $C$ and its orientation, not on the parametrization $\varphi$. Moreover,

$$\int_{-C} F \cdot ds = -\int_C F \cdot ds. \tag{4.3.10}$$

Example Let $C$ be the unit circle centered at the origin in $\mathbb{R}^2$, oriented in the
counterclockwise direction, and let

$$F(x,y) = \frac{1}{x^2 + y^2}(-y, x).$$

To compute the line integral of $F$ along $C$, we first need to find a smooth parametrization
of $C$. One such parametrization is

$$\varphi(t) = (\cos(t), \sin(t))$$

for $0 \le t \le 2\pi$. Then

$$\begin{aligned}
\int_C F \cdot ds &= \int_0^{2\pi} F(\cos(t),\sin(t)) \cdot (-\sin(t),\cos(t))\,dt \\
&= \int_0^{2\pi} \frac{1}{\cos^2(t) + \sin^2(t)}(-\sin(t),\cos(t)) \cdot (-\sin(t),\cos(t))\,dt \\
&= \int_0^{2\pi} (\sin^2(t) + \cos^2(t))\,dt \\
&= \int_0^{2\pi} dt \\
&= 2\pi.
\end{aligned}$$

Note that $\psi(t) = (\sin(t), \cos(t))$, $0 \le t \le 2\pi$, parametrizes $-C$, from which we can calculate

$$\begin{aligned}
\int_{-C} F \cdot ds &= \int_0^{2\pi} F(\sin(t),\cos(t)) \cdot (\cos(t),-\sin(t))\,dt \\
&= \int_0^{2\pi} \frac{1}{\sin^2(t) + \cos^2(t)}(-\cos(t),\sin(t)) \cdot (\cos(t),-\sin(t))\,dt \\
&= \int_0^{2\pi} (-\cos^2(t) - \sin^2(t))\,dt \\
&= -\int_0^{2\pi} dt \\
&= -2\pi,
\end{aligned}$$

in agreement with the previous proposition.
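A numerical check of this example is sketched below in Python. It assumes the field
$F(x,y) = \frac{1}{x^2+y^2}(-y,x)$ implied by the computation above; the helper name
`line_integral` and the midpoint rule are illustrative choices, not part of the text.

```python
import math

def line_integral(F, curve, velocity, a, b, m=2000):
    """Midpoint-rule approximation of the integral of F(curve(t)) . velocity(t) over [a, b]."""
    dt = (b - a) / m
    total = 0.0
    for k in range(m):
        t = a + (k + 0.5) * dt
        total += sum(f * v for f, v in zip(F(*curve(t)), velocity(t))) * dt
    return total

def F(x, y):
    r2 = x * x + y * y
    return (-y / r2, x / r2)

two_pi = 2.0 * math.pi
# Counterclockwise circle C, then the reversed curve -C.
ccw = line_integral(F, lambda t: (math.cos(t), math.sin(t)),
                    lambda t: (-math.sin(t), math.cos(t)), 0.0, two_pi)
cw = line_integral(F, lambda t: (math.sin(t), math.cos(t)),
                   lambda t: (math.cos(t), -math.sin(t)), 0.0, two_pi)
print(ccw, cw)
```

The two results should differ only in sign, as (4.3.10) predicts.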
Figure 4.3.3 Rectangle with counterclockwise orientation



A piecewise smooth curve is one which may be decomposed into a finite number of
curves, each of which has a smooth parametrization. If $C$ is a piecewise smooth curve
composed of the union of the curves $C_1, C_2, \ldots, C_m$, then we may extend the definition
of the line integral to $C$ by defining

$$\int_C F \cdot ds = \int_{C_1} F \cdot ds + \int_{C_2} F \cdot ds + \cdots + \int_{C_m} F \cdot ds. \tag{4.3.11}$$

The next example illustrates this procedure.

Example Let $C$ be the rectangle in $\mathbb{R}^2$ with vertices at $(0,0)$, $(2,0)$, $(2,1)$, and $(0,1)$,
oriented in the counterclockwise direction, and let $F(x,y) = (y^2, 2xy)$. If we let $C_1$, $C_2$,
$C_3$, and $C_4$ be the four sides of $C$, as labeled in Figure 4.3.3, then we may parametrize $C_1$
by

$$\alpha(t) = (t, 0),$$

$0 \le t \le 2$, $C_2$ by

$$\beta(t) = (2, t),$$

$0 \le t \le 1$, $C_3$ by

$$\gamma(t) = (2 - t, 1),$$

$0 \le t \le 2$, and $C_4$ by

$$\delta(t) = (0, 1 - t),$$

$0 \le t \le 1$. Then

$$\begin{aligned}
\int_C F \cdot ds &= \int_{C_1} F \cdot ds + \int_{C_2} F \cdot ds + \int_{C_3} F \cdot ds + \int_{C_4} F \cdot ds \\
&= \int_0^2 F(t,0) \cdot (1,0)\,dt + \int_0^1 F(2,t) \cdot (0,1)\,dt + \int_0^2 F(2-t,1) \cdot (-1,0)\,dt \\
&\qquad + \int_0^1 F(0,1-t) \cdot (0,-1)\,dt \\
&= \int_0^2 (0,0) \cdot (1,0)\,dt + \int_0^1 (t^2,4t) \cdot (0,1)\,dt + \int_0^2 (1,4-2t) \cdot (-1,0)\,dt \\
&\qquad + \int_0^1 ((1-t)^2,0) \cdot (0,-1)\,dt \\
&= \int_0^2 0\,dt + \int_0^1 4t\,dt + \int_0^2 (-1)\,dt + \int_0^1 0\,dt \\
&= 2t^2\Big|_0^1 - 2 \\
&= 2 - 2 \\
&= 0.
\end{aligned}$$

Note that it would be slightly simpler to parametrize $-C_3$ and $-C_4$, using

$$\varphi(t) = (t, 1),$$

$0 \le t \le 2$, and

$$\psi(t) = (0, t),$$

$0 \le t \le 1$, respectively, than to parametrize $C_3$ and $C_4$ directly. We would then evaluate

$$\int_C F \cdot ds = \int_{C_1} F \cdot ds + \int_{C_2} F \cdot ds - \int_{-C_3} F \cdot ds - \int_{-C_4} F \cdot ds.$$
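The piecewise computation in (4.3.11) can be mirrored in a few lines of Python. This is a
sketch under stated assumptions: the helper name `segment_integral` is hypothetical, a
midpoint rule stands in for exact integration, and the four pieces are the parametrizations
of the sides of the rectangle with vertices $(0,0)$, $(2,0)$, $(2,1)$, $(0,1)$ for $F(x,y) = (y^2, 2xy)$.

```python
def segment_integral(F, curve, velocity, a, b, m=1000):
    """Midpoint-rule approximation of the line integral of F along one smooth piece."""
    dt = (b - a) / m
    total = 0.0
    for k in range(m):
        t = a + (k + 0.5) * dt
        fx, fy = F(*curve(t))
        vx, vy = velocity(t)
        total += (fx * vx + fy * vy) * dt
    return total

def F(x, y):
    return (y * y, 2.0 * x * y)

# (parametrization, derivative, a, b) for the sides C1, C2, C3, C4 in order.
pieces = [
    (lambda t: (t, 0.0),       lambda t: (1.0, 0.0),  0.0, 2.0),
    (lambda t: (2.0, t),       lambda t: (0.0, 1.0),  0.0, 1.0),
    (lambda t: (2.0 - t, 1.0), lambda t: (-1.0, 0.0), 0.0, 2.0),
    (lambda t: (0.0, 1.0 - t), lambda t: (0.0, -1.0), 0.0, 1.0),
]
side_values = [segment_integral(F, c, v, a, b) for c, v, a, b in pieces]
total = sum(side_values)
print(side_values, total)
```

Printing the per-side values makes it easy to see which sides contribute and how they cancel.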

A note on notation 

Suppose $C$ is a smooth curve in $\mathbb{R}^n$, parametrized by $\varphi : I \to \mathbb{R}^n$, where $I = [a,b]$, and let
$F : \mathbb{R}^n \to \mathbb{R}^n$ be a continuous vector field. Our notation for the line integral of $F$ along $C$
comes from letting $s = \varphi(t)$, from which we have

$$\frac{ds}{dt} = D\varphi(t),$$

which we may write, symbolically, as

$$ds = D\varphi(t)\,dt.$$

Now suppose $\varphi_1, \varphi_2, \ldots, \varphi_n$ and $F_1, F_2, \ldots, F_n$ are the component functions of $\varphi$ and
$F$, respectively. If we let

$$\begin{aligned}
x_1 &= \varphi_1(t), \\
x_2 &= \varphi_2(t), \\
&\;\;\vdots \\
x_n &= \varphi_n(t),
\end{aligned}$$

then we may write

$$\begin{aligned}
\int_C F \cdot ds &= \int_a^b F(\varphi(t)) \cdot D\varphi(t)\,dt \\
&= \int_a^b F(x_1(t),x_2(t),\ldots,x_n(t)) \cdot (\varphi_1'(t),\varphi_2'(t),\ldots,\varphi_n'(t))\,dt \\
&= \int_a^b \big(F_1(x_1(t),x_2(t),\ldots,x_n(t))\varphi_1'(t) + F_2(x_1(t),x_2(t),\ldots,x_n(t))\varphi_2'(t) + \cdots \\
&\qquad + F_n(x_1(t),x_2(t),\ldots,x_n(t))\varphi_n'(t)\big)\,dt \\
&= \int_a^b F_1(x_1(t),x_2(t),\ldots,x_n(t))\varphi_1'(t)\,dt + \int_a^b F_2(x_1(t),x_2(t),\ldots,x_n(t))\varphi_2'(t)\,dt \\
&\qquad + \cdots + \int_a^b F_n(x_1(t),x_2(t),\ldots,x_n(t))\varphi_n'(t)\,dt.
\end{aligned} \tag{4.3.12}$$

Suppressing the dependence on $t$, writing $dx_k$ for $\varphi_k'(t)\,dt$, $k = 1, 2, \ldots, n$, and using only
a single integral sign, we may rewrite (4.3.12) as

$$\int_C F_1(x_1,x_2,\ldots,x_n)\,dx_1 + F_2(x_1,x_2,\ldots,x_n)\,dx_2 + \cdots + F_n(x_1,x_2,\ldots,x_n)\,dx_n. \tag{4.3.13}$$

This is a common, and useful, notation for a line integral.
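Example We will evaluate

$$\int_C y\,dx + x\,dy + z^2\,dz,$$

where $C$ is the part of a helix in $\mathbb{R}^3$ with parametric equations

$$\begin{aligned}
x &= \cos(t), \\
y &= \sin(t), \\
z &= t,
\end{aligned}$$

$0 \le t \le 2\pi$. Note that this is equivalent to evaluating

$$\int_C F \cdot ds,$$

where $F : \mathbb{R}^3 \to \mathbb{R}^3$ is the vector field $F(x,y,z) = (y, x, z^2)$. We have

$$\begin{aligned}
\int_C y\,dx + x\,dy + z^2\,dz &= \int_0^{2\pi} (\sin(t)(-\sin(t)) + \cos(t)\cos(t) + t^2)\,dt \\
&= \int_0^{2\pi} (\cos^2(t) - \sin^2(t) + t^2)\,dt \\
&= \int_0^{2\pi} (\cos(2t) + t^2)\,dt \\
&= \frac{1}{2}\sin(2t)\Big|_0^{2\pi} + \frac{1}{3}t^3\Big|_0^{2\pi} \\
&= \frac{8\pi^3}{3}.
\end{aligned}$$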
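The differential notation translates directly into code: each $dx_k$ becomes the derivative of
the corresponding coordinate times $dt$. The Python sketch below is an illustration only,
assuming composite Simpson's rule (the helper `simpson` and the name `integrand` are
hypothetical) applied to the helix $x = \cos(t)$, $y = \sin(t)$, $z = t$, $0 \le t \le 2\pi$.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def integrand(t):
    x, y, z = math.cos(t), math.sin(t), t          # point on the helix
    dx, dy, dz = -math.sin(t), math.cos(t), 1.0    # derivatives of x, y, z with respect to t
    return y * dx + x * dy + z ** 2 * dz           # y dx + x dy + z^2 dz, per unit dt

value = simpson(integrand, 0.0, 2.0 * math.pi)
exact = 8.0 * math.pi ** 3 / 3.0
print(value, exact)
```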
Gradient fields

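Recall that if $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$, then $\nabla f$ is a continuous vector field on $\mathbb{R}^n$. Suppose
$\varphi : I \to \mathbb{R}^n$, $I = [a,b]$, is a smooth parametrization of a curve $C$. Then, using the chain
rule and the Fundamental Theorem of Calculus,

$$\begin{aligned}
\int_C \nabla f \cdot ds &= \int_a^b \nabla f(\varphi(t)) \cdot D\varphi(t)\,dt \\
&= \int_a^b \frac{d}{dt} f(\varphi(t))\,dt \\
&= f(\varphi(t))\Big|_a^b \\
&= f(\varphi(b)) - f(\varphi(a)).
\end{aligned}$$

Theorem If $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ and $\varphi : I \to \mathbb{R}^n$, $I = [a,b]$, is a smooth parametrization
of a curve $C$, then

$$\int_C \nabla f \cdot ds = f(\varphi(b)) - f(\varphi(a)). \tag{4.3.14}$$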

Note that (4.3.14) shows that the value of a line integral of a gradient vector field 
depends only on the starting and ending points of the curve, not on which particular 
path is taken between these two points. Moreover, (4.3.14) provides a simple means for 
evaluating a line integral if the given vector field can be identified as the gradient of a scalar 
valued function. Another interesting consequence is that if the beginning and ending points 
of $C$ are the same, that is, if $v = \varphi(a) = \varphi(b)$, then

$$\int_C \nabla f \cdot ds = f(\varphi(b)) - f(\varphi(a)) = f(v) - f(v) = 0. \tag{4.3.15}$$

We call such curves closed curves. In words, the line integral of a gradient vector field is
0 along any closed curve.

Example If $F(x,y) = (y,x)$, then

$$F(x,y) = \nabla f(x,y),$$

where $f(x,y) = xy$. Hence, for example, for any smooth curve $C$ starting at $(-1,1)$ and
ending at $(1,1)$ we have

$$\int_C F \cdot ds = f(1,1) - f(-1,1) = 1 + 1 = 2.$$

Note that this agrees with the result in our first example above, where $C$ was the part of
the parabola $y = x^2$ extending from $(-1,1)$ to $(1,1)$.
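Path independence for this gradient field can be observed numerically. The Python sketch
below is illustrative: the three paths from $(-1,1)$ to $(1,1)$ (the parabola, a straight line,
and a sinusoidal path) and the helper name `line_integral` are assumptions chosen for the
demonstration.

```python
import math

def line_integral(F, curve, velocity, a, b, m=4000):
    """Midpoint-rule approximation of the integral of F(curve(t)) . velocity(t) over [a, b]."""
    dt = (b - a) / m
    total = 0.0
    for k in range(m):
        t = a + (k + 0.5) * dt
        total += sum(f * v for f, v in zip(F(*curve(t)), velocity(t))) * dt
    return total

F = lambda x, y: (y, x)      # F is the gradient of f(x, y) = x * y
f = lambda x, y: x * y

# Each path runs from (-1, 1) to (1, 1) as t goes from -1 to 1.
paths = {
    "parabola": (lambda t: (t, t * t), lambda t: (1.0, 2.0 * t)),
    "straight line": (lambda t: (t, 1.0), lambda t: (1.0, 0.0)),
    "sinusoidal path": (lambda t: (t, 1.0 + math.sin(math.pi * t)),
                        lambda t: (1.0, math.pi * math.cos(math.pi * t))),
}
results = {name: line_integral(F, c, v, -1.0, 1.0) for name, (c, v) in paths.items()}
potential_difference = f(1.0, 1.0) - f(-1.0, 1.0)
print(results, potential_difference)
```

Each entry of `results` should match `potential_difference`, as (4.3.14) asserts.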

Example If $f(x,y) = xy^2$, then

$$\nabla f(x,y) = (y^2, 2xy).$$

If $C$ is the rectangle in $\mathbb{R}^2$ with vertices at $(0,0)$, $(2,0)$, $(2,1)$, and $(0,1)$, then, since $C$ is
a closed curve,

$$\int_C y^2\,dx + 2xy\,dy = 0,$$

in agreement with an earlier example. Similarly, if $E$ is the unit circle in $\mathbb{R}^2$ centered at
the origin, then we know that

$$\int_E y^2\,dx + 2xy\,dy = 0,$$

with no need for further computations.

In physics, a force field $F$ is said to be conservative if the work done by $F$ in moving
an object between any two points depends only on the points, and not on the path used
between the two points. In particular, we have shown that if $F$ is the gradient of some
scalar function $f$, then $F$ is a conservative force field. Under certain conditions on the
domain of $F$, the converse is true as well. That is, under certain conditions, if $F$ is a
conservative force field, then there exists a scalar function $f$ such that $F = \nabla f$. Problem
9 explores one such situation in which this is true. The function $f$ is then known as a
potential function.



Problems 

1. For each of the following, compute the line integral $\int_C F \cdot ds$ for the given vector field
$F$ and curve $C$ parametrized by $\varphi$.

(a) $F(x,y) = (xy, 3x)$, $\varphi(t) = (t^2, t)$, $0 \le t \le 2$

(b) $F(x,y) = \left(\frac{y}{x^2+y^2}, -\frac{x}{x^2+y^2}\right)$, $\varphi(t) = (\cos(t), \sin(t))$, $0 \le t \le 2\pi$

(c) $F(x,y) = (3x - 2y, 4x^2y)$, $\varphi(t) = (t^3, t^2)$, $-2 \le t \le 2$

(d) $F(x,y,z) = (xyz, 3xy^2, 4z)$, $\varphi(t) = (3t, t^2, 4t^3)$, $0 \le t \le 4$

2. Let $C$ be the circle of radius 2 centered at the origin in $\mathbb{R}^2$, with counterclockwise
orientation. Evaluate the following line integrals.

(a) $\int_C 3x\,dx + 4y\,dy$ (b) $\int_C 8xy\,dx + 4x^2\,dy$

3. Let $C$ be the part of a helix in $\mathbb{R}^3$ parametrized by $\varphi(t) = (\cos(2t), \sin(2t), t)$,
$0 \le t \le 2\pi$. Evaluate the following line integrals.

(a) $\int_C 3x\,dx + 4y\,dy + z\,dz$ (b) $\int_C yz\,dx + xz\,dy + xy\,dz$

4. Let $C$ be the rectangle in $\mathbb{R}^2$ with vertices at $(-1,1)$, $(2,1)$, $(2,3)$, and $(-1,3)$, with
counterclockwise orientation. Evaluate the following line integrals.

(a) $\int_C 3y\,dx + (3y + x)\,dy$ (b) $\int_C 2xy\,dx + x^2\,dy$
5. Let $C$ be the ellipse in $\mathbb{R}^2$ with equation

$$\frac{x^2}{4} + \frac{y^2}{9} = 1,$$

with counterclockwise orientation. Evaluate $\int_C F \cdot ds$ for $F(x,y) = (4y, 3x)$.

6. Let $C$ be the upper half of the circle of radius 3 centered at the origin in $\mathbb{R}^2$, with
counterclockwise orientation. Evaluate the following line integrals.

(a) $\int_C 3y\,dx$ (b) $\int_C 4x\,dy$

7. Evaluate

$$\int_C \frac{x}{x^2+y^2}\,dx + \frac{y}{x^2+y^2}\,dy,$$

where $C$ is any curve which starts at $(1,0)$ and ends at $(2,3)$.

8. (a) Suppose $F : \mathbb{R}^n \to \mathbb{R}^n$ is a $C^1$ vector field which is the gradient of a scalar function
$f : \mathbb{R}^n \to \mathbb{R}$. If $F_k$ is the $k$th coordinate function of $F$, $k = 1, 2, \ldots, n$, show that

$$\frac{\partial}{\partial x_j} F_i(x_1,x_2,\ldots,x_n) = \frac{\partial}{\partial x_i} F_j(x_1,x_2,\ldots,x_n)$$

for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$.

(b) Show that although

$$\int_C x\,dx + xy\,dy = 0$$

for every circle $C$ in $\mathbb{R}^2$ with center at the origin, nevertheless $F(x,y) = (x, xy)$ is
not the gradient of any scalar function $f : \mathbb{R}^2 \to \mathbb{R}$.

(c) Let

$$F(x,y) = \left(-\frac{y}{x^2+y^2}, \frac{x}{x^2+y^2}\right)$$

for all $(x,y)$ in the set $S = \{(x,y) : (x,y) \ne (0,0)\}$. Let $F_1$ and $F_2$ be the
coordinate functions of $F$. Show that

$$\frac{\partial}{\partial y} F_1(x,y) = \frac{\partial}{\partial x} F_2(x,y)$$

for all $(x,y)$ in $S$, even though $F$ is not the gradient of any scalar function. (Hint:
For the last part, show that

$$\int_C F \cdot ds = 2\pi,$$

where $C$ is the unit circle centered at the origin.)
9. Suppose $F : \mathbb{R}^2 \to \mathbb{R}^2$ is a continuous vector field with the property that for any curve
$C$,

$$\int_C F \cdot ds$$

depends only on the endpoints of $C$. That is, if $C_1$ and $C_2$ are any two curves with
the same endpoints $P$ and $Q$, then

$$\int_{C_1} F \cdot ds = \int_{C_2} F \cdot ds.$$

(a) Show that

$$\int_C F \cdot ds = 0$$

for any closed curve $C$.

(b) Let $F_1$ and $F_2$ be the coordinate functions of $F$. Define $f : \mathbb{R}^2 \to \mathbb{R}$ by

$$f(x,y) = \int_C F \cdot ds,$$

where $C$ is any curve which starts at $(0,0)$ and ends at $(x,y)$. Show that

$$\frac{\partial}{\partial y} f(x,y) = F_2(x,y).$$

(Hint: In evaluating $f(x,y)$, consider the curve $C$ from $(0,0)$ to $(x,y)$ which consists
of the horizontal line from $(0,0)$ to $(x,0)$ followed by the vertical line from $(x,0)$
to $(x,y)$.)

(c) Show that $\nabla f = F$.
Green's Theorem

Green's theorem is an example from a family of theorems which connect line integrals (and
their higher-dimensional analogues) with the definite integrals we studied in Section 3.6.
We will first look at Green's theorem for rectangles, and then generalize to more complex
curves and regions in $\mathbb{R}^2$.

Green's theorem for rectangles 

Suppose $F : \mathbb{R}^2 \to \mathbb{R}^2$ is $C^1$ on an open set containing the closed rectangle

$$D = [a,b] \times [c,d],$$

and let $F_1$ and $F_2$ be the coordinate functions of $F$. If $C$ denotes the boundary of $D$,
oriented in the counterclockwise direction, then we may decompose $C$ into the four curves
$C_1$, $C_2$, $C_3$, and $C_4$ shown in Figure 4.4.1.

Figure 4.4.1 The boundary of a rectangle decomposed into four smooth curves

Then

$$\alpha(t) = (t, c),$$

$a \le t \le b$, is a smooth parametrization of $C_1$,

$$\beta(t) = (b, t),$$

$c \le t \le d$, is a smooth parametrization of $C_2$,

$$\gamma(t) = (t, d),$$

$a \le t \le b$, is a smooth parametrization of $-C_3$, and

$$\delta(t) = (a, t),$$

$c \le t \le d$, is a smooth parametrization of $-C_4$. Now

$$\begin{aligned}
\int_C F \cdot ds &= \int_{C_1} F \cdot ds + \int_{C_2} F \cdot ds + \int_{C_3} F \cdot ds + \int_{C_4} F \cdot ds \\
&= \int_{C_1} F \cdot ds + \int_{C_2} F \cdot ds - \int_{-C_3} F \cdot ds - \int_{-C_4} F \cdot ds,
\end{aligned} \tag{4.4.1}$$

and

$$\int_{C_1} F \cdot ds = \int_a^b (F_1(t,c),F_2(t,c)) \cdot (1,0)\,dt = \int_a^b F_1(t,c)\,dt, \tag{4.4.2}$$

$$\int_{C_2} F \cdot ds = \int_c^d (F_1(b,t),F_2(b,t)) \cdot (0,1)\,dt = \int_c^d F_2(b,t)\,dt, \tag{4.4.3}$$

$$\int_{-C_3} F \cdot ds = \int_a^b (F_1(t,d),F_2(t,d)) \cdot (1,0)\,dt = \int_a^b F_1(t,d)\,dt, \tag{4.4.4}$$

and

$$\int_{-C_4} F \cdot ds = \int_c^d (F_1(a,t),F_2(a,t)) \cdot (0,1)\,dt = \int_c^d F_2(a,t)\,dt. \tag{4.4.5}$$

Hence, inserting (4.4.2) through (4.4.5) into (4.4.1),

$$\begin{aligned}
\int_C F \cdot ds &= \int_a^b F_1(t,c)\,dt + \int_c^d F_2(b,t)\,dt - \int_a^b F_1(t,d)\,dt - \int_c^d F_2(a,t)\,dt \\
&= \int_c^d (F_2(b,t) - F_2(a,t))\,dt - \int_a^b (F_1(t,d) - F_1(t,c))\,dt.
\end{aligned} \tag{4.4.6}$$

Now, by the Fundamental Theorem of Calculus, for a fixed value of $t$,

$$\int_a^b \frac{\partial}{\partial x} F_2(x,t)\,dx = F_2(b,t) - F_2(a,t) \tag{4.4.7}$$

and

$$\int_c^d \frac{\partial}{\partial y} F_1(t,y)\,dy = F_1(t,d) - F_1(t,c). \tag{4.4.8}$$
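Thus, combining (4.4.7) and (4.4.8) with (4.4.6), we have

$$\begin{aligned}
\int_C F \cdot ds &= \int_c^d \int_a^b \frac{\partial}{\partial x} F_2(x,t)\,dx\,dt - \int_a^b \int_c^d \frac{\partial}{\partial y} F_1(t,y)\,dy\,dt \\
&= \int_c^d \int_a^b \frac{\partial}{\partial x} F_2(x,y)\,dx\,dy - \int_a^b \int_c^d \frac{\partial}{\partial y} F_1(x,y)\,dy\,dx \\
&= \iint_D \left(\frac{\partial}{\partial x} F_2(x,y) - \frac{\partial}{\partial y} F_1(x,y)\right) dx\,dy.
\end{aligned} \tag{4.4.9}$$

If we let $p = F_1(x,y)$, $q = F_2(x,y)$, and $\partial D = C$ (a common notation for the boundary of
$D$), then we may rewrite (4.4.9) as

$$\int_{\partial D} p\,dx + q\,dy = \iint_D \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx\,dy. \tag{4.4.10}$$

This is Green's theorem for a rectangle.

Example If $D = [1,3] \times [2,5]$, then

$$\begin{aligned}
\int_{\partial D} xy\,dx + x\,dy &= \iint_D \left(\frac{\partial}{\partial x} x - \frac{\partial}{\partial y} xy\right) dx\,dy \\
&= \int_1^3 \int_2^5 (1 - x)\,dy\,dx \\
&= \int_1^3 3(1 - x)\,dx \\
&= 3\left(x - \frac{x^2}{2}\right)\Big|_1^3 \\
&= -6.
\end{aligned}$$

Clearly, this is simpler than evaluating the line integral directly.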
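Both sides of (4.4.10) can be compared numerically for this example. The Python sketch
below is illustrative only: the helper name `integrate` is hypothetical, a midpoint rule is
assumed, and the partial derivatives $\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y} = 1 - x$ are entered by hand for
$p = xy$ and $q = x$.

```python
def integrate(g, a, b, m=400):
    """Midpoint rule for a function of one variable on [a, b]."""
    h = (b - a) / m
    return sum(g(a + (k + 0.5) * h) for k in range(m)) * h

p = lambda x, y: x * y
q = lambda x, y: x
a, b, c, d = 1.0, 3.0, 2.0, 5.0   # D = [a, b] x [c, d]

# Boundary integral of p dx + q dy, one side at a time as in (4.4.6).
line = (integrate(lambda t: p(t, c), a, b)      # bottom side, left to right
        + integrate(lambda t: q(b, t), c, d)    # right side, upward
        - integrate(lambda t: p(t, d), a, b)    # top side, via its reversed parametrization
        - integrate(lambda t: q(a, t), c, d))   # left side, via its reversed parametrization

# Double integral of dq/dx - dp/dy = 1 - x over D, as an iterated integral.
double = integrate(lambda x: integrate(lambda y: 1.0 - x, c, d), a, b)
print(line, double)
```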
Green's theorem for regions of Type III 

Green's theorem holds for more general regions than rectangles. We will confine ourselves 
here to discussing regions known as regions of Type III, but it is not hard to generalize to 
regions which may be subdivided into regions of this type (for an example, see Problem 
12). Recall from Section 3.6 that we say a region $D$ in $\mathbb{R}^2$ is of Type I if there exist real
numbers $a < b$ and continuous functions $\alpha : \mathbb{R} \to \mathbb{R}$ and $\beta : \mathbb{R} \to \mathbb{R}$ such that

$$D = \{(x,y) : a \le x \le b,\ \alpha(x) \le y \le \beta(x)\}. \tag{4.4.11}$$

We say a region $D$ in $\mathbb{R}^2$ is of Type II if there exist real numbers $c < d$ and continuous
functions $\gamma : \mathbb{R} \to \mathbb{R}$ and $\delta : \mathbb{R} \to \mathbb{R}$ such that

$$D = \{(x,y) : c \le y \le d,\ \gamma(y) \le x \le \delta(y)\}. \tag{4.4.12}$$
Figure 4.4.2 Decomposing the boundary of a region of Type I

Definition We call a region $D$ in $\mathbb{R}^2$ which is both of Type I and of Type II a region of
Type III.

Example In Section 3.6, we saw that the triangle $T$ with vertices at $(0,0)$, $(1,0)$, and
$(1,1)$ and the closed disk

$$D = \bar{B}^2((0,0),1) = \{(x,y) : x^2 + y^2 \le 1\}$$

are of both Type I and Type II. Thus $T$ and $D$ are regions of Type III. We also saw that
the region $E$ beneath the graph of $y = x^2$ and above the interval $[-1,1]$ is of Type I, but
not of Type II. Hence $E$ is not of Type III.

Example Any closed rectangle in $\mathbb{R}^2$ is a region of Type III, as is any closed region
bounded by an ellipse.

Now suppose $D$ is a region of Type III and $\partial D$ is the boundary of $D$, that is, the
curve enclosing $D$, oriented counterclockwise. Let $F : \mathbb{R}^2 \to \mathbb{R}^2$ be a $C^1$ vector field, with
coordinate functions $p = F_1(x,y)$ and $q = F_2(x,y)$. We will first prove that

$$\int_{\partial D} p\,dx = -\iint_D \frac{\partial p}{\partial y}\,dx\,dy. \tag{4.4.13}$$

Since $D$ is, in particular, a region of Type I, there exist continuous functions $\alpha$ and $\beta$ such
that

$$D = \{(x,y) : a \le x \le b,\ \alpha(x) \le y \le \beta(x)\}. \tag{4.4.14}$$

In addition, we will assume that $\alpha$ and $\beta$ are both differentiable (without this assumption
the line integral of $F$ along $\partial D$ would not be defined). As with the rectangle in the previous
proof, we may decompose $\partial D$ into four curves, $C_1$, $C_2$, $C_3$, and $C_4$, as shown in Figure
4.4.2. Then

$$\varphi_1(t) = (t, \alpha(t)),$$

$a \le t \le b$, is a smooth parametrization of $C_1$,

$$\varphi_2(t) = (b, t),$$

$\alpha(b) \le t \le \beta(b)$, is a smooth parametrization of $C_2$,

$$\varphi_3(t) = (t, \beta(t)),$$

$a \le t \le b$, is a smooth parametrization of $-C_3$, and

$$\varphi_4(t) = (a, t),$$

$\alpha(a) \le t \le \beta(a)$, is a smooth parametrization of $-C_4$. Now

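$$\int_{\partial D} p\,dx = \int_{C_1} p\,dx + \int_{C_2} p\,dx - \int_{-C_3} p\,dx - \int_{-C_4} p\,dx, \tag{4.4.15}$$

where

$$\int_{C_1} p\,dx = \int_a^b (F_1(t,\alpha(t)),0) \cdot (1,\alpha'(t))\,dt = \int_a^b F_1(t,\alpha(t))\,dt, \tag{4.4.16}$$

$$\int_{C_2} p\,dx = \int_{\alpha(b)}^{\beta(b)} (F_1(b,t),0) \cdot (0,1)\,dt = \int_{\alpha(b)}^{\beta(b)} 0\,dt = 0, \tag{4.4.17}$$

$$\int_{-C_3} p\,dx = \int_a^b (F_1(t,\beta(t)),0) \cdot (1,\beta'(t))\,dt = \int_a^b F_1(t,\beta(t))\,dt, \tag{4.4.18}$$

and

$$\int_{-C_4} p\,dx = \int_{\alpha(a)}^{\beta(a)} (F_1(a,t),0) \cdot (0,1)\,dt = \int_{\alpha(a)}^{\beta(a)} 0\,dt = 0. \tag{4.4.19}$$

Hence

$$\begin{aligned}
\int_{\partial D} p\,dx &= \int_a^b F_1(t,\alpha(t))\,dt - \int_a^b F_1(t,\beta(t))\,dt \\
&= -\int_a^b (F_1(t,\beta(t)) - F_1(t,\alpha(t)))\,dt.
\end{aligned} \tag{4.4.20}$$

Now, by the Fundamental Theorem of Calculus,

$$\int_{\alpha(t)}^{\beta(t)} \frac{\partial}{\partial y} F_1(t,y)\,dy = F_1(t,\beta(t)) - F_1(t,\alpha(t)), \tag{4.4.21}$$

and so

$$\begin{aligned}
\int_{\partial D} p\,dx &= -\int_a^b \int_{\alpha(t)}^{\beta(t)} \frac{\partial}{\partial y} F_1(t,y)\,dy\,dt \\
&= -\int_a^b \int_{\alpha(x)}^{\beta(x)} \frac{\partial}{\partial y} F_1(x,y)\,dy\,dx \\
&= -\iint_D \frac{\partial p}{\partial y}\,dx\,dy.
\end{aligned} \tag{4.4.22}$$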
d d y 
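A similar calculation, treating $D$ as a region of Type II, shows that

$$\int_{\partial D} q\,dy = \iint_D \frac{\partial q}{\partial x}\,dx\,dy. \tag{4.4.23}$$

(You are asked to verify this in Problem 7.) Putting (4.4.22) and (4.4.23) together, we
have

$$\begin{aligned}
\int_{\partial D} F \cdot ds = \int_{\partial D} p\,dx + q\,dy &= -\iint_D \frac{\partial p}{\partial y}\,dx\,dy + \iint_D \frac{\partial q}{\partial x}\,dx\,dy \\
&= \iint_D \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx\,dy.
\end{aligned} \tag{4.4.24}$$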



Green's Theorem Suppose $D$ is a region of Type III, $\partial D$ is the boundary of $D$ with
counterclockwise orientation, and the curves describing $\partial D$ are differentiable. Let
$F : \mathbb{R}^2 \to \mathbb{R}^2$ be a $C^1$ vector field, with coordinate functions $p = F_1(x,y)$ and $q = F_2(x,y)$.
Then

$$\int_{\partial D} p\,dx + q\,dy = \iint_D \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx\,dy. \tag{4.4.25}$$



Example Let $D$ be the region bounded by the triangle with vertices at $(0,0)$, $(2,0)$, and
$(0,3)$, as shown in Figure 4.4.3. If we orient $\partial D$ in the counterclockwise direction, then

$$\begin{aligned}
\int_{\partial D} (3x^2 + y)\,dx + 5x\,dy &= \iint_D \left(\frac{\partial}{\partial x}(5x) - \frac{\partial}{\partial y}(3x^2 + y)\right) dx\,dy \\
&= \iint_D (5 - 1)\,dx\,dy \\
&= 4 \iint_D dx\,dy \\
&= (4)(3) \\
&= 12,
\end{aligned}$$

where we have used the fact that the area of $D$ is 3 to evaluate the double integral.
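The line integral on the left can also be evaluated side by side as a check. The Python
sketch below is an illustration under stated assumptions: the helper name
`segment_integral` is hypothetical, each side of the triangle is parametrized as
$P + t(Q - P)$, $0 \le t \le 1$, and Simpson's rule is used for the integral in $t$.

```python
def segment_integral(F, P, Q, n=10):
    """Simpson's rule (n even) for the line integral of F along the segment from P to Q."""
    dx, dy = Q[0] - P[0], Q[1] - P[1]

    def g(t):
        # F at the point P + t (Q - P), dotted with the constant velocity (dx, dy).
        fx, fy = F(P[0] + t * dx, P[1] + t * dy)
        return fx * dx + fy * dy

    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

F = lambda x, y: (3.0 * x * x + y, 5.0 * x)
vertices = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0)]   # listed counterclockwise

line = sum(segment_integral(F, vertices[i], vertices[(i + 1) % 3]) for i in range(3))
area = 0.5 * 2.0 * 3.0                            # area of the triangle
print(line, 4.0 * area)
```

Because the integrand along each side is a quadratic polynomial in $t$, Simpson's rule
introduces no quadrature error here beyond rounding.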

The line integral in the previous example reduced to finding the area of the region
$D$. This can be exploited in the reverse direction to compute the area of a region. For
example, given a region $D$ with area $A$ and boundary $\partial D$, it follows from Green's theorem
that

$$A = \iint_D dx\,dy = \int_{\partial D} p\,dx + q\,dy \tag{4.4.26}$$

for any choice of $p$ and $q$ which have the property that

$$\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y} = 1. \tag{4.4.27}$$

Figure 4.4.3 A triangle with counterclockwise orientation

For example, letting $p = 0$ and $q = x$, we have

$$A = \int_{\partial D} x\,dy \tag{4.4.28}$$

and, letting $p = -y$ and $q = 0$, we have

$$A = -\int_{\partial D} y\,dx. \tag{4.4.29}$$

The next example illustrates using the average of (4.4.28) and (4.4.29) to find $A$:

$$A = \frac{1}{2}\left(\int_{\partial D} x\,dy - \int_{\partial D} y\,dx\right) = \frac{1}{2}\int_{\partial D} x\,dy - y\,dx. \tag{4.4.30}$$



Example Let $A$ be the area of the region $D$ bounded by the ellipse with equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$

where $a > 0$ and $b > 0$, as shown in Figure 4.4.4. Since we may parametrize $\partial D$, with
counterclockwise orientation, by

$$\varphi(t) = (a\cos(t), b\sin(t)),$$

$0 \le t \le 2\pi$, we have

$$\begin{aligned}
A &= \frac{1}{2}\int_{\partial D} x\,dy - y\,dx \\
&= \frac{1}{2}\int_0^{2\pi} (-b\sin(t), a\cos(t)) \cdot (-a\sin(t), b\cos(t))\,dt \\
&= \frac{1}{2}\int_0^{2\pi} (ab\sin^2(t) + ab\cos^2(t))\,dt \\
&= \frac{ab}{2}\int_0^{2\pi} dt \\
&= \frac{ab}{2}(2\pi) \\
&= \pi ab.
\end{aligned}$$

Figure 4.4.4 The ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ with counterclockwise orientation
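Formula (4.4.30) lends itself to a short numerical routine. The Python sketch below is
illustrative: the function name `enclosed_area`, the midpoint rule, and the sample
semi-axes 3 and 2 are assumptions, and the curve is taken to be closed and traversed once
counterclockwise.

```python
import math

def enclosed_area(x, y, dx, dy, a, b, m=1000):
    """Approximate (1/2) * integral of (x dy - y dx) over a closed curve, as in (4.4.30)."""
    h = (b - a) / m
    total = 0.0
    for k in range(m):
        t = a + (k + 0.5) * h
        total += (x(t) * dy(t) - y(t) * dx(t)) * h
    return 0.5 * total

semi_a, semi_b = 3.0, 2.0   # sample semi-axes, chosen for illustration
area = enclosed_area(
    lambda t: semi_a * math.cos(t), lambda t: semi_b * math.sin(t),     # x(t), y(t)
    lambda t: -semi_a * math.sin(t), lambda t: semi_b * math.cos(t),    # x'(t), y'(t)
    0.0, 2.0 * math.pi,
)
print(area, math.pi * semi_a * semi_b)
```

Reversing the orientation of the curve would change the sign of the result, so the
counterclockwise assumption matters.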



Problems 



1. Let $D$ be the closed rectangle in $\mathbb{R}^2$ with vertices at $(0,0)$, $(2,0)$, $(2,4)$, and $(0,4)$,
with boundary $\partial D$ oriented counterclockwise. Use Green's theorem to evaluate the
following line integrals.

(a) $\int_{\partial D} 2xy\,dx + 3x^2\,dy$ (b) $\int_{\partial D} y\,dx + x\,dy$

2. Let $D$ be the triangle in $\mathbb{R}^2$ with vertices at $(0,0)$, $(2,0)$, and $(0,4)$, with boundary $\partial D$
oriented counterclockwise. Use Green's theorem to evaluate the following line integrals.



(a) / WW*** 

JdD 

(b) $\int_{\partial D} y\,dx + x\,dy$ (c) $\int_{\partial D} y\,dx - x\,dy$
3. Use Green's theorem to find the area of a circle of radius $r$.

4. Use Green's theorem to find the area of the region $D$ enclosed by the hypocycloid

$$x^{2/3} + y^{2/3} = a^{2/3},$$

where $a > 0$. Note that we may parametrize this curve using

$$\varphi(t) = (a\cos^3(t), a\sin^3(t)),$$

$0 \le t \le 2\pi$.

5. Use Green's theorem to find the area of the region enclosed by one "petal" of the curve
parametrized by

$$\varphi(t) = (\sin(2t)\cos(t), \sin(2t)\sin(t)).$$

6. Find the area of the region enclosed by the cardioid parametrized by

$$\varphi(t) = ((2 + \cos(t))\cos(t), (2 + \cos(t))\sin(t)),$$

$0 \le t \le 2\pi$.

7. Verify (4.4.23), thus completing the proof of Green's theorem.

8. Suppose the vector field $F : \mathbb{R}^2 \to \mathbb{R}^2$ with coordinate functions $p = F_1(x,y)$ and
$q = F_2(x,y)$ is $C^1$ on an open set containing the Type III region $D$. Moreover, suppose
$F$ is the gradient of a scalar function $f : \mathbb{R}^2 \to \mathbb{R}$.

(a) Show that

$$\frac{\partial q}{\partial x} = \frac{\partial p}{\partial y}$$

for all points $(x,y)$ in $D$.

(b) Use Green's theorem to show that

$$\int_{\partial D} p\,dx + q\,dy = 0,$$

where $\partial D$ is the boundary of $D$ with counterclockwise orientation.
9. How many ways do you know to calculate the area of a circle? 

10. Who was George Green? 

11. Explain how Green's theorem is a generalization of the Fundamental Theorem of
Integral Calculus.

12. Let $b > a$, let $C_1$ be the circle of radius $b$ centered at the origin, and let $C_2$ be the
circle of radius $a$ centered at the origin. If $D$ is the annular region between $C_1$ and
$C_2$ and $F$ is a $C^1$ vector field with coordinate functions $p = F_1(x,y)$ and $q = F_2(x,y)$,
show that

$$\iint_D \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx\,dy = \int_{C_1} p\,dx + q\,dy + \int_{C_2} p\,dx + q\,dy,$$

where $C_1$ is oriented in the counterclockwise direction and $C_2$ is oriented in the
clockwise direction. (Hint: Decompose $D$ into Type III regions $D_1$, $D_2$, $D_3$, and $D_4$, each
with boundary oriented counterclockwise, as shown in Figure 4.4.5.)



